Phylogenetic Networks with Recombination
- Dana Poole
1 Phylogenetic Networks with Recombination October
2 Recombination All DNA is recombinant DNA... [The] natural process of recombination and mutation have acted throughout evolution... Genetic exchange works constantly to blend and rearrange chromosomes, most obviously during meiosis... J. Watson We are interested in reconstructing the history of mutations and recombinations creating observed SNP (binary) sequences. Our thesis is that recombination networks can be constructed by efficient algorithms using genome variation data in populations, and that those networks reflect true recombination history sufficiently well to help resolve or clarify many biological issues.
3 Crossing Over The best understood form of recombination, occurring during every meiosis, is single-crossover recombination, also called crossing-over. Figure: A single-crossover recombination between two parent sequences, producing the recombinant. The prefix (underlined) contributed by parental sequence 1 consists of the first three characters of SNP sequence 1. The suffix (underlined) contributed by parental sequence 2 consists of the last two characters of SNP sequence 2.
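A minimal sketch of the single-crossover operation described on this slide. The function name, the parameter k (the number of leading characters taken from the prefix parent), and the two example sequences are illustrative assumptions; the sequences in the original figure are not recoverable from the transcript.

```python
def single_crossover(prefix_parent: str, suffix_parent: str, k: int) -> str:
    """Return the recombinant made of the first k characters of prefix_parent
    followed by the remaining characters of suffix_parent.

    k is an assumed convention for the crossover point, chosen for this sketch.
    """
    if len(prefix_parent) != len(suffix_parent):
        raise ValueError("parent sequences must have the same length")
    if not 0 <= k <= len(prefix_parent):
        raise ValueError("crossover point out of range")
    return prefix_parent[:k] + suffix_parent[k:]


# Hypothetical 5-site SNP sequences: the prefix is the first three characters
# of parent 1, and the suffix is the last two characters of parent 2.
recombinant = single_crossover("01011", "10100", 3)
```

With these made-up parents the recombinant is "010" followed by "00".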
4 Figure: A graphical representation of a single-crossover recombination event. P contributes the prefix, and S contributes the suffix of the recombinant sequence. The crossover point is written above the recombination node.
5 SNP sequences The input consists of binary sequences modeling SNP sequences. The SNP sites are linearly ordered, and together on a single chromosome, but are generally not physically contiguous. We follow the infinite sites model: a SNP site mutates exactly once in the history of the sequences.
6 Genealogical Network and Ancestral Recombination Graph Figure: An ARG N with two recombination nodes, and the matrix of sequences M (rows a to g) that are derived by N.
7 Genealogical Network and Ancestral Recombination Graph Figure: An ARG N with three recombination nodes
8 Two parts of this talk 1. Idealized Association Mapping using ARGs - a present and future application. 2. The phenomena of invisible and Steiner nodes in ARGs and in cluster-based phylogenetic networks.
9 Association Mapping: an example of the use of ARGs We examine a very idealized example in order to illustrate the logic. Definition In a pure-mendelian disease, there is a single (causal) site c in the genome, and a single causal state (say 1) for c, such that any individual in the population will have the disease if and only if they have state 1 at site c. We have a sample of known diseased individuals (Cases) and of non-diseased individuals (Controls), and we have a binary (SNP) sequence for each sampled individual.
10 Figure: The true ARG and the disease status of the individuals (diseased individuals are marked). We want to determine where site c is in the genome, and when the mutation happened there.
11 The first thing we can deduce is that the mutation occurs during the time represented by the edge labeled 2. The location of c can then be deduced as follows. Figure: The intervals (deduced from individuals d and f) where c might be located.
12 Where is c? From the non-diseased d we deduce that c is after site 2, and from the diseased f we deduce that it is before site 4. Hence, we conclude that c is in the open interval (2,4). That is the finest deduction possible that is consistent with the data and ARG.
13 Association Mapping Summary If the true ARG were known, it would provide the optimal amount of information for mapping; no extra information would be available from the genotypes. Not only would disease-associated regions be identified, but the ARG would give the ages of the causative mutations. [?]
14 Invisible and Steiner Nodes The need for invisible nodes is a key technical problem in both ARGs and cluster-based phylogenetic networks. Steiner nodes are an additional problem for ARGs. What do we know about invisible and Steiner nodes? Four types of results. 1. The History Lower Bound and Rec-Invisible Nodes Given a set of sequences M, the History Bound of Myers and Griffiths is a lower bound on the number of needed recombinations in any ARG that creates M (with all-zero ancestral sequence). It is also a lower bound on the number of reticulation nodes in any softwired phylogenetic network for an input set of clusters. It is defined only by the algorithms that compute it.
15 The computation of the history bound uses three Rules Initially, set M̄ to the input M. As the algorithm proceeds, rows and columns of M̄ will be deleted. Let M̄ denote the current remaining submatrix of M as the algorithm executes. The algorithm executes three Rules. The first two are: Rule Dc: If a column c of M̄ contains at most one entry with value 1, then remove column c from M̄. Rule Dr: If two rows in M̄ are identical, remove one.
16 Algorithm Clean(M) Execute Rules Dc and Dr on M in any order until no further applications of Rules Dc or Dr are possible. Set M to M. Note that the execution of Rule Dc may create the conditions where Rule Dr applies, and the converse is also true. Since Rules Dc and Dr can be applied in any order, and to different columns and rows, it is conceivable that different executions of Algorithm Clean could produce different results. However, the next Lemma shows that they cannot.
17 Lemma The resulting submatrix M̄ of M created by running Algorithm Clean on M is invariant over all executions of Algorithm Clean. Lemma Assuming there is a perfect phylogeny with all-zero ancestral sequence for M, Algorithm Clean reduces M to a matrix containing a single row with no entries.
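A minimal sketch of Algorithm Clean, assuming the matrix is given as a collection of equal-length 0/1 strings (one string per row). Storing the rows in a set is a design choice of this sketch, not something stated on the slides: it applies Rule Dr implicitly, because identical rows collapse to one. The function name and the example matrices are hypothetical.

```python
def clean(rows):
    """Apply Rules Dc and Dr until neither applies; return the remaining rows.

    rows: iterable of equal-length strings over '0' and '1'.
    """
    current = set(rows)  # Rule Dr: identical rows merge automatically
    while True:
        if not current:
            return frozenset()
        width = len(next(iter(current)))
        # Rule Dc: keep only columns with at least two 1-entries.
        keep = [c for c in range(width)
                if sum(r[c] == "1" for r in current) >= 2]
        if len(keep) == width:
            return frozenset(current)  # no column removed, no duplicate rows
        # Removing columns may create new identical rows; the set merges them.
        current = {"".join(r[c] for c in keep) for r in current}


# Hypothetical matrix with a perfect phylogeny (all-zero ancestral row included):
tree_like = clean(["000", "100", "110", "101"])
# Hypothetical matrix whose two sites show all four binary pairs:
stuck = clean(["00", "01", "10", "11"])
```

For the first matrix the result is a single row with no entries, as the second Lemma states; for the second matrix neither rule applies, so all four rows remain.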
18 The Third Destructive Rule Rule Dt: If neither Rule Dc nor Dr can be applied, pick a row r in the current M̄ (other than the all-zero row that corresponds to the ancestral sequence) and remove row r from M̄.
19 Computing the History Bound
Algorithm CHB(M)
Set CLB(M) = 0 and set M̄ to M.
While (M̄ contains more than one row or contains some entries)
    Execute Algorithm Clean on M̄.
    Select a row r in M̄ and remove it, i.e., apply Rule Dt to M̄.
    Set CLB(M) = CLB(M) + 1.
End While
Return CLB(M)
The History Bound for M is the minimum CLB(M) value over all possible executions of Algorithm CHB(M).
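The minimum over all executions can be sketched as an exhaustive, memoized search over the choices made by Rule Dt. This is only an illustration under stated assumptions: rows are 0/1 strings, the all-zero ancestral row is added explicitly, the search stops as soon as Clean leaves a single row, and the running time is exponential in the number of rows. The names clean and history_bound are hypothetical.

```python
from functools import lru_cache


def clean(rows):
    """Rules Dc and Dr applied to a fixed point (rows are 0/1 strings)."""
    current = set(rows)  # a set merges identical rows (Rule Dr)
    while True:
        if not current:
            return frozenset()
        width = len(next(iter(current)))
        keep = [c for c in range(width)
                if sum(r[c] == "1" for r in current) >= 2]  # Rule Dc
        if len(keep) == width:
            return frozenset(current)
        current = {"".join(r[c] for c in keep) for r in current}


def history_bound(sequences):
    """Smallest number of Rule Dt applications over all executions of CHB."""
    seqs = list(sequences)
    start = frozenset(seqs) | {"0" * len(seqs[0])}  # add the ancestral row

    @lru_cache(maxsize=None)
    def best(state):
        state = clean(state)
        if len(state) <= 1:
            return 0  # a single row with no entries remains
        zero = "0" * len(next(iter(state)))
        # Rule Dt: try removing every row except the all-zero row.
        return 1 + min(best(state - {r}) for r in state if r != zero)

    return best(start)


# Hypothetical inputs: one with a perfect phylogeny, one with two
# sites that show all four binary pairs once the ancestral row is added.
bound_tree = history_bound(["100", "110", "101"])
bound_conflict = history_bound(["01", "10", "11"])
```

The first input needs no application of Rule Dt, and the second needs exactly one.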
20 Example Figure: The input M. No application of Rule Dc or Dr is possible. So pick a row, say r6, for Rule Dt.
21 Example Figure: Now apply Rule Dc to column 4.
22 Example Figure: Now apply Rule Dr, and remove row r5.
23 Example Figure: Now apply Rule Dc twice to remove columns 1 and 6.
24 Example Figure: Now apply Rule Dt to remove row 4.
25 Example Figure: Now apply Rule Dt again to remove row 3.
26 Example Figure: Now apply Rule Dc twice to remove columns 2 and 5.
27 Example Figure (remaining matrix: the single column 3, with r1 = 1 and r2 = 1): Now apply Rule Dr to remove row 2.
28 Example Figure (remaining matrix: the single column 3, with r1 = 1): Now apply Rule Dc to remove column 3 to obtain a single row with no entries.
29 A graphical view of Algorithm CHB and the History Bound Consider an ARG N for M, and an execution of Algorithm CHB. Each application of a destructive rule Dc, Dr or Dt removes a column or row from M̄. We will define ARG destruction rules to reduce N in parallel with the execution of Algorithm CHB. We let Ñ denote the remaining portion of N, and use M̄, as before, to denote the current matrix derived from M.
30 ARG destruction Rules and Facts:
1. When Rule Dc removes a column c, remove label c on an edge into a leaf.
2. When Rule Dr removes a row in M̄ with sequence s, remove one of two sibling leaves, each of which is labeled with sequence s.
3. Fact: There is an execution of Algorithm CHB where each application of Rule Dt removes a row in M̄ whose sequence s labels a recombination node x in Ñ, and also labels the leaf-child of x.
4. When Rule Dt removes sequence s from M̄, modify the current Ñ by removing the leaf labeled s and then successively removing edges that are ancestral to the leaf, following any such path backwards until the path reaches a node that has out-degree at least two.
31 Each time we modify Ñ, we also contract any resulting node incident with one in-edge and one out-edge. Lemma Using the destructive rules, during an execution of Algorithm CHB, each ARG Ñ will be an ARG that generates the sequences in the corresponding matrix M̄. Lemma When M̄ is a single row with no sites, Ñ will consist only of the root node of N.
32 Example Figure: The original ARG N
33 Example Figure: ARG Ñ after the first application of Rule Dt.
34 Example Figure: ARG Ñ after application of Rule Dc, removing edge label 6.
35 Example Figure: ARG Ñ after the second application of Rule Dt. That application of Rule Dt also removes the recombination node labeled (originally ). That node is rec-invisible. Next, Algorithm Clean will remove all entries of M, and the remaining tree will be reduced to the root node.
36 Now we return to invisible nodes Definition A recombination node x in an ARG N is called rec-visible if there is some path from x to a leaf of N that does not contain a recombination node other than x. Otherwise it is called rec-invisible, i.e., every path from x to a leaf encounters another recombination node. Rec-visible nodes are also called normal in the cluster-model literature (S. Willson). Theorem The History-Bound for M will be strictly less than Rmin^0(M) if there is a MinARG N for M containing a rec-invisible recombination node x. The proof involves showing that the phenomenon that happened in the example always happens if there is a rec-invisible node in any MinARG for M.
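The definition can be checked mechanically on a small directed acyclic graph. In the sketch below, the graph encoding (a dict from each node to the list of its children), the rule that a recombination node is any node with in-degree at least two, and the example graph are all assumptions made for illustration; none of them is taken from the slides.

```python
def recombination_nodes(children):
    """Nodes with at least two incoming edges (assumed encoding)."""
    indegree = {}
    for parent, kids in children.items():
        indegree.setdefault(parent, 0)
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1
    return {node for node, d in indegree.items() if d >= 2}


def is_rec_visible(children, x):
    """True if some path from x to a leaf avoids every recombination node
    other than x itself."""
    rec = recombination_nodes(children)
    stack, seen = [x], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        kids = children.get(node, [])
        if not kids:
            return True  # reached a leaf without meeting another rec node
        stack.extend(k for k in kids if k not in rec)
    return False


# Hypothetical ARG shape: x and y are recombination nodes, and the only
# child of x is y, so every path from x to a leaf passes through y.
example = {
    "root": ["a", "b"],
    "a": ["leaf1", "x"],
    "b": ["x", "c"],
    "x": ["y"],
    "c": ["y", "leaf2"],
    "y": ["leaf3"],
}
x_visible = is_rec_visible(example, "x")
y_visible = is_rec_visible(example, "y")
```

In this made-up graph, x is rec-invisible and y is rec-visible.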
37 Normal Cluster Networks Many nice things happen when the data (splits) comes from Normal or Regular or Tree-Child networks. (Baroni, Semple, Steel, Willson, Kelk, van Iersel, Valiente, Nakhleh and more.) Bad things sometimes happen when the data comes from non-normal networks. (Song, Kelk, van Iersel.)
38 Stating the Theorem differently, in order for the History-Bound to have a chance to be equal to Rmin^0(M), no MinARG N for M can have a rec-invisible node. Equivalently, if the History-Bound is tight for M, then in every MinARG for M, every recombination node must be normal.
39 Corollary The difference between Rmin^0(M) and the History-Bound for M is at least as large as the minimum number of rec-invisible recombination nodes in the MinARGs for M. The proof involves a closer look at the proof of the Theorem, so that the phenomenon occurs for each rec-invisible node in any MinARG for M. This suggests that the history bound is generally weak for the cluster-based model. It can be increased in the ARG model by use of the composite method, but that does not work for the cluster model. The graphical interpretation also allows us to establish a lower bound on the History-Bound.
40 Definition A recombination node v in an ARG is called hyper-visible if no path from v reaches another recombination node. Theorem The History-Bound for M is at least as large as the minimum number of hyper-visible recombination nodes over all the ARGs for M. Corollary The History-Bound is tight for M if every ARG for M has at least Rmin^0(M) hyper-visible recombination nodes. This can happen only if in every MinARG for M, every recombination node is hyper-visible.
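A short sketch of the hyper-visibility test, under the same assumed encoding as before: the ARG is a dict from each node to the list of its children, and a recombination node is any node with in-degree at least two. The function names and the example graph are hypothetical.

```python
def recombination_nodes(children):
    """Nodes with at least two incoming edges (assumed encoding)."""
    indegree = {}
    for parent, kids in children.items():
        indegree.setdefault(parent, 0)
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1
    return {node for node, d in indegree.items() if d >= 2}


def is_hyper_visible(children, v):
    """True if no directed path starting at v reaches another
    recombination node."""
    rec = recombination_nodes(children)
    stack, seen = list(children.get(v, [])), set()
    while stack:
        node = stack.pop()
        if node in rec:
            return False  # a recombination node lies below v
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return True


# Hypothetical ARG shape: recombination node y lies below recombination node x.
example = {
    "root": ["a", "b"],
    "a": ["leaf1", "x"],
    "b": ["x", "c"],
    "x": ["y"],
    "c": ["y", "leaf2"],
    "y": ["leaf3"],
}
x_hyper = is_hyper_visible(example, "x")
y_hyper = is_hyper_visible(example, "y")
```

Here y is hyper-visible, while x is not because y is reachable from it.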
41 Result 2. A NASC for no Steiner nodes Definition A Steiner node in an ARG N is a node that is labeled with a sequence not in the input set M. Steiner nodes are common, but why? When do we need them? Definition Let D_r(M) and D_c(M) be the number of distinct rows and distinct columns of M, respectively. The Haplotype Lower Bound on M, denoted H(M), is defined to be D_r(M) - D_c(M) - 1. Theorem If M is generated on an ARG N whose root sequence is not in M, then the number of recombination nodes in N must be at least D_r(M) - D_c(M) = H(M) + 1.
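The Haplotype Lower Bound follows directly from the definition. A minimal sketch, assuming the rows of M are given as equal-length 0/1 strings; the function name and the example matrix are hypothetical.

```python
def haplotype_bound(rows):
    """H(M) = (number of distinct rows) - (number of distinct columns) - 1."""
    rows = list(rows)
    distinct_rows = set(rows)
    distinct_cols = set(zip(*rows))  # each column as a tuple of characters
    return len(distinct_rows) - len(distinct_cols) - 1


# Hypothetical matrix: four distinct rows over two distinct sites.
h = haplotype_bound(["00", "01", "10", "11"])
```

For this matrix, D_r(M) = 4 and D_c(M) = 2, so H(M) = 4 - 2 - 1 = 1.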
42 For simplicity, we will assume that the root sequence is specified and is in M. Close examination of the proof of the preceding Theorem leads to the following: Theorem If N is an ARG for M that has only H(M) recombination nodes (and hence is a MinARG), and every site in M is distinct, then N has no Steiner nodes, and no edge is labeled with more than one mutation. Theorem If M is derived on an ARG N with no Steiner nodes and at most one mutation per edge, then N contains exactly H(M) recombination nodes. Hence N is a MinARG for M. Moreover, no MinARG for M with one mutation per edge can have any Steiner nodes.
43 Since the Haplotype Bound is always less than or equal to the History Bound, we have Theorem M can be derived on an ARG with no Steiner nodes and one mutation per edge only if every MinARG for M (with one mutation per edge) has no rec-invisible recombination nodes.
44 Topic 3 Definition We define the incompatibility graph G(M) for M as the graph containing one node for each site in M, and an edge connecting two nodes c and d if and only if sites c and d are incompatible, i.e. contain all four binary pairs 0,0; 0,1; 1,0; 1,1. Definition A connected component C of a graph is a maximal subgraph such that for any pair of nodes (u,v) in C there is at least one path between u and v in the subgraph C. A trivial component has only one node, and no edges.
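Both definitions translate into a few lines of code. The sketch below assumes rows are equal-length 0/1 strings and numbers sites from 0; the function names and the example matrix are hypothetical.

```python
from itertools import combinations

ALL_PAIRS = {("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")}


def incompatible(rows, c, d):
    """True if sites c and d together contain all four binary pairs."""
    return {(r[c], r[d]) for r in rows} >= ALL_PAIRS


def incompatibility_graph(rows):
    """Adjacency sets of G(M): one node per site, edges between
    incompatible sites."""
    m = len(rows[0])
    adj = {c: set() for c in range(m)}
    for c, d in combinations(range(m), 2):
        if incompatible(rows, c, d):
            adj[c].add(d)
            adj[d].add(c)
    return adj


def connected_components(adj):
    """Connected components of the graph, each returned as a set of sites."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical matrix: sites 0 and 1 are incompatible, and site 2 is
# compatible with both of them.
M = ["000", "010", "100", "110", "001"]
graph = incompatibility_graph(M)
components = connected_components(graph)
```

In this made-up matrix, sites 0 and 1 form the only non-trivial component, and site 2 is a trivial component.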
45 Figure: Example. An ARG with leaves a to g, the matrix M with rows a to g, and the incompatibility graph for M.
46 Theorem Let G(M) be the incompatibility graph for the set of sequences M. Then there is an ARG N that derives M, where every blob in N contains all and only the sites of a single non-trivial connected component of G(M), and every compatible site is on a cut-edge of N. Definition An ARG N is called fully-decomposed if it has the structure specified above.
47 Figure: Example
48 Figure: ARG that is not fully decomposed, for the same sequences.
49 visibility Theorem Let N be an ARG for M where all the nodes in N are visible. Then there is a fully-decomposed ARG for M with the same number of recombination nodes as N. Corollary There is no fully-decomposed MinARG for M only if every MinARG for M has at least one Steiner node.
50 Next we establish a relationship between the haplotype bound for M, H(M), and full-decomposition. Theorem Assume that the set of sequences M contains no duplicate sites. If the haplotype bound is tight for M, then there is a fully-decomposed MinARG for M. Here we state the most general result established for the existence of a fully-decomposed MinARG. Theorem There is a fully-decomposed MinARG for M if there is a MinARG N such that incompatibility graphs G(M) and G(L_N) have the same number of connected components.
More informationDr. Amira A. AL-Hosary
Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological
More informationGraph coloring, perfect graphs
Lecture 5 (05.04.2013) Graph coloring, perfect graphs Scribe: Tomasz Kociumaka Lecturer: Marcin Pilipczuk 1 Introduction to graph coloring Definition 1. Let G be a simple undirected graph and k a positive
More informationCREATING PHYLOGENETIC TREES FROM DNA SEQUENCES
INTRODUCTION CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES This worksheet complements the Click and Learn developed in conjunction with the 2011 Holiday Lectures on Science, Bones, Stones, and Genes:
More informationShortest paths with negative lengths
Chapter 8 Shortest paths with negative lengths In this chapter we give a linear-space, nearly linear-time algorithm that, given a directed planar graph G with real positive and negative lengths, but no
More informationConnectivity and tree structure in finite graphs arxiv: v5 [math.co] 1 Sep 2014
Connectivity and tree structure in finite graphs arxiv:1105.1611v5 [math.co] 1 Sep 2014 J. Carmesin R. Diestel F. Hundertmark M. Stein 20 March, 2013 Abstract Considering systems of separations in a graph
More informationMajor questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.
Evolutionary Genetics (for Encyclopedia of Biodiversity) Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville, TN 37996-6 USA Evolutionary
More informationLinear Algebra II. 2 Matrices. Notes 2 21st October Matrix algebra
MTH6140 Linear Algebra II Notes 2 21st October 2010 2 Matrices You have certainly seen matrices before; indeed, we met some in the first chapter of the notes Here we revise matrix algebra, consider row
More informationInformation Theory and Statistics Lecture 2: Source coding
Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection
More informationExact Algorithms for Dominating Induced Matching Based on Graph Partition
Exact Algorithms for Dominating Induced Matching Based on Graph Partition Mingyu Xiao School of Computer Science and Engineering University of Electronic Science and Technology of China Chengdu 611731,
More informationRealization Plans for Extensive Form Games without Perfect Recall
Realization Plans for Extensive Form Games without Perfect Recall Richard E. Stearns Department of Computer Science University at Albany - SUNY Albany, NY 12222 April 13, 2015 Abstract Given a game in
More informationAGENDA Go Over DUT; offer REDO opportunity Notes on Intro to Evolution Cartoon Activity
Date: Number your notebook and label the top the following: EVEN Pages-LEFT SIDE Page 176- Concept Map Page 178- Sequence Page 180- Vocabulary Page 182- Warm Ups Page 184- Cartoon Questions HN- Natural
More informationPhylogenetic networks: overview, subclasses and counting problems Philippe Gambette
ANR-FWF-MOST meeting 2018-10-30 - Wien Phylogenetic networks: overview, subclasses and counting problems Philippe Gambette Outline An introduction to phylogenetic networks Classes of phylogenetic networks
More information1 Efficient Transformation to CNF Formulas
1 Efficient Transformation to CNF Formulas We discuss an algorithm, due to Tseitin [?], which efficiently transforms an arbitrary Boolean formula φ to a CNF formula ψ such that ψ has a model if and only
More informationWeek 4. (1) 0 f ij u ij.
Week 4 1 Network Flow Chapter 7 of the book is about optimisation problems on networks. Section 7.1 gives a quick introduction to the definitions of graph theory. In fact I hope these are already known
More informationUnit 4 Review - Genetics. UNIT 4 Vocabulary topics: Cell Reproduction, Cell Cycle, Cell Division, Genetics
Unit 4 Review - Genetics Sexual vs. Asexual Reproduction Mendel s Laws of Heredity Patterns of Inheritance Meiosis and Genetic Variation Non-Mendelian Patterns of Inheritance Cell Reproduction/Cell Cycle/
More informationThe Lander-Green Algorithm. Biostatistics 666 Lecture 22
The Lander-Green Algorithm Biostatistics 666 Lecture Last Lecture Relationship Inferrence Likelihood of genotype data Adapt calculation to different relationships Siblings Half-Siblings Unrelated individuals
More informationFamily Trees for all grades. Learning Objectives. Materials, Resources, and Preparation
page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available
More informationCographs; chordal graphs and tree decompositions
Cographs; chordal graphs and tree decompositions Zdeněk Dvořák September 14, 2015 Let us now proceed with some more interesting graph classes closed on induced subgraphs. 1 Cographs The class of cographs
More informationGenetic Engineering and Creative Design
Genetic Engineering and Creative Design Background genes, genotype, phenotype, fitness Connecting genes to performance in fitness Emergent gene clusters evolved genes MIT Class 4.208 Spring 2002 Evolution
More informationEnumeration and symmetry of edit metric spaces. Jessie Katherine Campbell. A dissertation submitted to the graduate faculty
Enumeration and symmetry of edit metric spaces by Jessie Katherine Campbell A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY
More informationA PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS
A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS CRYSTAL L. KAHN and BENJAMIN J. RAPHAEL Box 1910, Brown University Department of Computer Science & Center for Computational Molecular Biology
More informationSIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding
SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.
More information27: Case study with popular GM III. 1 Introduction: Gene association mapping for complex diseases 1
10-708: Probabilistic Graphical Models, Spring 2015 27: Case study with popular GM III Lecturer: Eric P. Xing Scribes: Hyun Ah Song & Elizabeth Silver 1 Introduction: Gene association mapping for complex
More informationOn the Subnet Prune and Regraft distance
On the Subnet Prune and Regraft distance Jonathan Klawitter and Simone Linz Department of Computer Science, University of Auckland, New Zealand jo. klawitter@ gmail. com, s. linz@ auckland. ac. nz arxiv:805.07839v
More informationThe Gauss-Jordan Elimination Algorithm
The Gauss-Jordan Elimination Algorithm Solving Systems of Real Linear Equations A. Havens Department of Mathematics University of Massachusetts, Amherst January 24, 2018 Outline 1 Definitions Echelon Forms
More informationFamily Trees for all grades. Learning Objectives. Materials, Resources, and Preparation
page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available
More informationThe Inflation Technique for Causal Inference with Latent Variables
The Inflation Technique for Causal Inference with Latent Variables arxiv:1609.00672 (Elie Wolfe, Robert W. Spekkens, Tobias Fritz) September 2016 Introduction Given some correlations between the vocabulary
More informationMaximising the number of induced cycles in a graph
Maximising the number of induced cycles in a graph Natasha Morrison Alex Scott April 12, 2017 Abstract We determine the maximum number of induced cycles that can be contained in a graph on n n 0 vertices,
More informationSolving the Maximum Agreement Subtree and Maximum Comp. Tree problems on bounded degree trees. Sylvain Guillemot, François Nicolas.
Solving the Maximum Agreement Subtree and Maximum Compatible Tree problems on bounded degree trees LIRMM, Montpellier France 4th July 2006 Introduction The Mast and Mct problems: given a set of evolutionary
More informationEvolutionary Tree Analysis. Overview
CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based
More informationMatroid Secretary for Regular and Decomposable Matroids
Matroid Secretary for Regular and Decomposable Matroids Michael Dinitz Weizmann Institute of Science mdinitz@cs.cmu.edu Guy Kortsarz Rutgers University, Camden guyk@camden.rutgers.edu Abstract In the matroid
More information