Identifiability of the GTR+Γ substitution model (and other models) of DNA evolution
1 Identifiability of the GTR+Γ substitution model (and other models) of DNA evolution
Elizabeth S. Allman, Dept. of Mathematics and Statistics, University of Alaska Fairbanks
Current Challenges and Problems in Phylogenetics, Isaac Newton Institute, Cambridge, England, 5 September 2007
2 Joint work with J. Rhodes, C. Ané
3 Identifiability.
A model of molecular evolution M is identifiable if the values of all parameters can be determined from the joint distribution P of states.
Parameters = tree topology(ies), stationary distribution, edge lengths, rate matrix Q, Γ shape parameter, Markov edge matrices, $p_{inv}$, etc.
Identifiability is necessary to have consistency of statistical inference, whether using ML or Bayesian methods.
INI Identifiability Slide 1
4 Known identifiability results...
Negative:
- For sufficiently complicated rate-across-sites models (non-explicit), tree identifiability can fail (Steel-Székely-Hendy, J. Comp. Biol., 1994).
- Explicit non-generic examples (not r-a-s) of non-identifiability of mixtures (Štefankovič-Vigoda, Sys. Biol., 2007; J. Comp. Biol., 2007).
- Non-generic 2-class mixtures on one tree can exactly agree with a 1-class model on a different tree (Matsen-Steel, preprint).
- More general study of many-class non-identifiable mixtures under the 2-state symmetric model (Matsen-Mossel-Steel, preprint).
5 Positive:
- GTR is identifiable (use log-det distance to identify tree, etc.).
- GM is identifiable (Chang, Math. Biosci., 1996).
- General result on mixture models on one tree with a small number of classes (Allman-Rhodes, J. Comp. Biol., 2006). For DNA models, the tree is generically identifiable for: GTR+I; GTR with 3 rate-across-sites classes; GTR+GTR+GTR; GM+GM+GM; covarion with 3 rate classes.
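The slide above mentions the log-det distance as the tool for recovering the tree. The following is a minimal sketch of that idea, not the speaker's code: it builds exact pairwise joint distributions on the quartet tree ab|cd under a general Markov model and checks the four-point condition. The helper names (`random_markov`, `quartet_pair_joints`, `four_point_sums`), the random parameters, and the particular marginal-corrected form of the distance are illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def random_markov(rng, k=4, noise=0.3):
    """Row-stochastic k x k matrix close to the identity (made-up test data)."""
    m = np.eye(k) + noise * rng.random((k, k))
    return m / m.sum(axis=1, keepdims=True)


def logdet_distance(joint):
    """Log-det distance from the k x k joint distribution of two taxa.

    The marginal correction terms make the distance additive along paths.
    """
    fx, fy = joint.sum(axis=1), joint.sum(axis=0)  # the two marginals
    return (-np.log(abs(np.linalg.det(joint)))
            + 0.5 * np.log(fx).sum() + 0.5 * np.log(fy).sum())


def quartet_pair_joints(rng):
    """Exact pairwise joints on the tree ab|cd, internal nodes u (root) and v."""
    pi_u = rng.dirichlet(np.full(4, 5.0))
    m = {e: random_markov(rng) for e in ("ua", "ub", "uv", "vc", "vd")}
    pi_v = pi_u @ m["uv"]
    from_u = {"a": m["ua"], "b": m["ub"],
              "c": m["uv"] @ m["vc"], "d": m["uv"] @ m["vd"]}
    joints = {}
    for x, y in combinations("abcd", 2):
        if x + y == "cd":  # c and d are independent given v, not given u
            joints[x + y] = m["vc"].T @ np.diag(pi_v) @ m["vd"]
        else:
            joints[x + y] = from_u[x].T @ np.diag(pi_u) @ from_u[y]
    return joints


def four_point_sums(seed=0):
    """Return (d_ab + d_cd, d_ac + d_bd, d_ad + d_bc) for one random quartet."""
    joints = quartet_pair_joints(np.random.default_rng(seed))
    d = {pair: logdet_distance(f) for pair, f in joints.items()}
    return d["ab"] + d["cd"], d["ac"] + d["bd"], d["ad"] + d["bc"]


if __name__ == "__main__":
    print(four_point_sums())
```

For the true split ab|cd, the first sum should be the strictly smallest of the three and the other two should agree up to rounding, which is how the distance singles out the topology.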
6 Generic vs. non-generic identifiability.
If $\mathcal{T}_n$ denotes n-leaf tree space and M any choice of model, then the parameterization map(s)
$\phi_M : \bigcup_{T \in \mathcal{T}_n} (T, S_T) \to \mathbb{C}^{\kappa^n}$, $(T, s_T) \mapsto P = \phi_{M,T}(s_T)$
give rise to the collection of joint distributions P for M.
M is identifiable $\iff$ $\phi_M$ is injective.
7 For a fixed tree T, the map
$\phi_{M,T} : \{\text{Parameters on } T\} \to \mathbb{C}^{\kappa^n}$, $s_T \mapsto P = \phi_{M,T}(s_T)$
associates to each tree T its phylogenetic variety $V_T$.
But $V_{T_1} \cap V_{T_2} \neq \emptyset$ always (star phylogenies).
If the intersection is of lower dimension, then the tree is identifiable for generic parameters.
8 (Slide 7 repeated, with a figure showing the two varieties $V_{T_1}$ and $V_{T_2}$.)
9 Today...
Q1: Is the GTR+Γ+I model identifiable?
Q2: Are 2-tree mixtures identifiable?
10 Q1: Is the GTR+Γ+I model identifiable?
Rogers (Sys. Biol., 2001) claimed a proof, which is widely cited, but the argument has several major gaps in showing identifiability:
1) crucial use of an unjustified graphical claim
2) generic vs. non-generic parameters
There is no valid, published proof that ML or Bayesian inference using the GTR+Γ+I model is consistent.
11 None of the previous work applies to GTR+Γ or GTR+Γ+I, since:
- a continuous rate distribution prevents application of the Allman-Rhodes positive results (or algebraic methods of proof);
- specifying a particular form of rate distribution prevents application of the negative Steel or Matsen-Mossel-Steel results.
12 New result: Allman, Ané, Rhodes (2007):
For 4-state (DNA) models, GTR+Γ is identifiable.
And, more generally, for κ-state models, GTR+Γ is generically identifiable.
Comments:
- This is the first proof of identifiability for a rate-across-sites model with a continuous distribution of rates.
- For DNA, identifiability holds for all parameters, not just generic ones.
- The proof does not follow Rogers' approach.
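To make the "continuous distribution of rates" concrete, here is a minimal sketch of the expected transition matrix along an edge under GTR+Γ. It assumes the usual convention of a mean-one Gamma rate (shape α, rate α), for which the Gamma moment generating function gives $E[e^{rtQ}] = (I - tQ/\alpha)^{-\alpha}$, and it cross-checks this closed form against Gauss-Laguerre quadrature. The function names and the numerical values of π, the exchangeabilities, t, and α are illustrative assumptions, not taken from the talk.

```python
import math

import numpy as np
from numpy.polynomial.laguerre import laggauss


def gtr_rate_matrix(exch, pi):
    """GTR rate matrix with Q[i, j] = exch[i, j] * pi[j] for i != j.

    exch is a symmetric matrix of exchangeabilities; Q is scaled so that
    the expected number of substitutions per unit time is one.
    """
    q = exch * pi                        # multiplies column j by pi[j]
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))  # rows sum to zero
    rate = -(pi * np.diag(q)).sum()
    return q / rate


def spectral_parts(q, pi):
    """Eigenvalues of Q plus left/right factors, via the symmetrized matrix."""
    root = np.sqrt(pi)
    lam, v = np.linalg.eigh(q * root[:, None] / root[None, :])
    return lam, v / root[:, None], v.T * root[None, :]


def gamma_averaged_closed_form(q, pi, t, alpha):
    """E[exp(r t Q)] for r ~ Gamma(shape=alpha, rate=alpha), in closed form."""
    lam, left, right = spectral_parts(q, pi)
    return left @ np.diag((1.0 - t * lam / alpha) ** (-alpha)) @ right


def gamma_averaged_quadrature(q, pi, t, alpha, deg=30):
    """The same expectation, integrated numerically over the rate r."""
    lam, left, right = spectral_parts(q, pi)
    x, w = laggauss(deg)                 # nodes/weights for weight exp(-x)
    dens = w * x ** (alpha - 1.0) / math.gamma(alpha)
    avg = np.zeros_like(q)
    for xi, di in zip(x, dens):          # substitution x = alpha * r
        avg += di * (left @ np.diag(np.exp(xi / alpha * t * lam)) @ right)
    return avg


if __name__ == "__main__":
    pi = np.array([0.1, 0.2, 0.3, 0.4])
    exch = np.array([[0, 1, 2, 1], [1, 0, 1, 3],
                     [2, 1, 0, 1], [1, 3, 1, 0]], dtype=float)
    q = gtr_rate_matrix(exch, pi)
    print(gamma_averaged_closed_form(q, pi, t=0.5, alpha=2.0))
```

The closed form shows why α enters the pattern probabilities only through the scalar functions $(1 - t\lambda_i/\alpha)^{-\alpha}$ of the eigenvalues $\lambda_i$ of Q, which is the kind of dependence an identifiability argument has to untangle.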
13 Main points of GTR+Γ proof:
- Obtain the stationary distribution and the eigenvectors of the rate matrix Q from 1- and 2-taxon marginals.
- Focus on a 3-leaf tree to identify the shape parameter α (work), then get Q and the edge lengths $t_e$. [Figure: 3-leaf tree with leaves $a_1$, $a_2$, $a_3$.]
- The result for an n-leaf tree then follows from combinatorial arguments.
- Use algebraic arguments to extract information from a 3-dimensional tensor.
- Use analytic arguments (convexity) for generic identifiability.
- Detailed analysis of non-generic cases completes the proof.
14 Note: We still lack a proof that the tree is identifiable for GTR+Γ+I.
This is likely to be significantly harder to prove since:
- Γ introduces only 1 parameter (shape parameter α);
- Γ+I introduces 2 parameters (α and the proportion of invariable sites $p_{inv}$).
15 Tree mixtures.
Different parts of sequences may have evolved along different trees:
- gene tree vs. species tree, incomplete lineage sorting [Figure: Species Tree, Gene 1, Gene 2]
- horizontal gene transfer
16 Two-tree mixtures can confound analysis.
- Mossel E. and Vigoda E., "Phylogenetic MCMC algorithms are misleading on mixtures of trees", Science 309, 2207 (2005).
- Ronquist F., Larget B., Huelsenbeck J., Kadane J., Simon D., and van der Mark P., "Comment on 'Phylogenetic MCMC algorithms are misleading on mixtures of trees'", Science 312, 367a (2006).
- Mossel E. and Vigoda E., "Response to comment on 'Phylogenetic MCMC algorithms are misleading on mixtures of trees'", Science 312, 367b (2006).
- Matsen F. and Steel M., "Phylogenetic mixtures on a single tree can mimic a tree of another topology", preprint. (2-state)
- Matsen F., Mossel E., and Steel M., "Mixed-up trees: the structure of phylogenetic mixtures", preprint. (2-state)
17 Simple model: 4-taxon trees $T_1$, $T_2$, $T_3$.
[Figure: the three quartet trees on taxa a, b, c, d, with $T_1$ = ab|cd, $T_2$ = ac|bd, $T_3$ = ad|bc.]
Joint distributions $P_{1,2}$ are 2-tree mixtures with δ a mixing parameter:
$P_{1,2} = \delta P_{M,T_1} + (1 - \delta) P_{M,T_2}$
Similarly for the other two mixtures.
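The mixture formula above can be sketched numerically. The example below uses the Jukes-Cantor model (one of the models mentioned for the two-tree result) because its transition matrix has a simple closed form; the function names, the edge lengths, and the value of δ are illustrative assumptions, not values from the talk.

```python
import numpy as np


def jc_matrix(t):
    """Jukes-Cantor transition matrix for an edge of length t."""
    e = np.exp(-4.0 * t / 3.0)
    return np.full((4, 4), 0.25 * (1.0 - e)) + e * np.eye(4)


def quartet_joint(split, pendant, internal):
    """Joint distribution of (a, b, c, d) on a quartet tree, as a 4x4x4x4 array.

    split    -- the two cherries, e.g. ("ab", "cd") for the tree ab|cd
    pendant  -- dict mapping each leaf name to its pendant edge length
    internal -- length of the internal edge
    """
    (x, y), (z, w) = split
    m = {leaf: jc_matrix(pendant[leaf]) for leaf in "abcd"}
    pi = np.full(4, 0.25)                      # uniform root distribution
    # u and v are the internal nodes; the output axes are always a, b, c, d.
    spec = f"u,u{x},u{y},uv,v{z},v{w}->abcd"
    return np.einsum(spec, pi, m[x], m[y], jc_matrix(internal), m[z], m[w])


def two_tree_mixture(p_first, p_second, delta):
    """delta * P_first + (1 - delta) * P_second."""
    return delta * p_first + (1.0 - delta) * p_second


if __name__ == "__main__":
    pendant = {leaf: 0.1 for leaf in "abcd"}
    p1 = quartet_joint(("ab", "cd"), pendant, internal=0.1)   # T_1
    p2 = quartet_joint(("ac", "bd"), pendant, internal=0.1)   # T_2
    p12 = two_tree_mixture(p1, p2, delta=0.3)
    print(p12.sum(), p12[0, 0, 1, 1], p12[0, 1, 0, 1])
```

Each array has $4^4 = 256$ entries, one per site pattern; the identifiability question is whether the trees and parameters can be read back from the mixed array alone.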
18 Theorem. Suppose $P_{ij}$ is a joint distribution arising from a 2-tree GM mixture on 4-taxon trees for κ = 4 states. Then the trees $T_i$, $T_j$ and stochastic parameters $s_i$, $s_j$ are generically identifiable from $P_{ij}$.
That is, given $P_{ij}$, we can generically identify $(T_i, s_i)$ and $(T_j, s_j)$.
A similar result holds for 2-tree GTR mixtures (and JC mixtures).
19 Two-tree mixtures proof. (GM)
Find a specific point B that lies on both $V_{GM,T_1,T_2}$ and $V_{GM,T_1,T_3}$.
Prove B is non-singular by computing in Maple the dimension of the tangent spaces $H_{1,2}$ at B to $V_{GM,T_1,T_2}$ and $H_{1,3}$ at B to $V_{GM,T_1,T_3}$:
$\dim(H_{1,2}) = 127$, $\dim(H_{1,3}) = 127$
20 All computations for GM can be done exactly:
- B can be chosen to arise from rational parameter values.
- The parameterization is given by polynomials with rational coefficients.
- Maple performs exact rational arithmetic.
Another computation (about ten minutes) shows that the two tangent spaces intersect in a lower-dimensional linear space:
$\dim(H_{1,2} \cap H_{1,3}) = 115$
This proves that $V_{GM,T_1,T_2}$ and $V_{GM,T_1,T_3}$ are different, and then by principles of algebraic geometry we have $\dim(V_{GM,T_1,T_2} \cap V_{GM,T_1,T_3}) < 127$.
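The exact-arithmetic strategy can be illustrated on a much smaller case. The sketch below is a toy analogue and not the Maple computation from the talk: it takes the 2-state general Markov model on a 3-leaf star tree (7 free parameters), evaluates the Jacobian of the parameterization at a rational point using Python's `fractions`, and computes its rank by Gaussian elimination. The model choice, the parameter ordering, the sample point, and all function names are assumptions made for illustration. Because the parameterization is affine in each parameter separately, the difference $\phi(p + e_i) - \phi(p)$ equals the partial derivative exactly.

```python
from fractions import Fraction as F
from itertools import product


def phi(p):
    """Pattern probabilities of the 2-state GM model on a 3-leaf star tree.

    p[0] is the root probability of state 0; p[1 + 2*k + r] is the
    probability that leaf k shows state 0 given root state r.
    """
    pi = (p[0], 1 - p[0])
    rows = [[(p[1 + 2 * k + r], 1 - p[1 + 2 * k + r]) for r in range(2)]
            for k in range(3)]
    return [sum(pi[r] * rows[0][r][i] * rows[1][r][j] * rows[2][r][k]
                for r in range(2))
            for i, j, k in product(range(2), repeat=3)]


def jacobian(p):
    """Exact Jacobian; row i holds the partial derivatives with respect to p[i]."""
    base = phi(p)
    jac = []
    for i in range(len(p)):
        shifted = list(p)
        shifted[i] += 1          # exact, since phi is affine in each p[i]
        jac.append([a - b for a, b in zip(phi(shifted), base)])
    return jac


def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            if factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


if __name__ == "__main__":
    point = [F(1, 3), F(3, 4), F(1, 5), F(2, 3), F(1, 4), F(4, 5), F(1, 3)]
    print(rank(jacobian(point)))   # tangent-space dimension at this point
```

Since every intermediate quantity is a `Fraction`, the rank is computed without rounding error, which is the property the slide relies on when it reports exact dimensions.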
21 Extension to GTR (non-algebraic):
Observe JC $\subset$ GTR.
Choose B to be a Jukes-Cantor point (rational, yet GTR) with $B \in X_{GTR,T_1,T_2} \cap X_{GTR,T_1,T_3}$.
Prove that there is a vector v tangent to $X_{GTR,T_1,T_3}$ at B that does not lie in the tangent plane at B to $V_{GM,T_1,T_2}$.
22 Preprint:
More informationHomoplasy. Selection of models of molecular evolution. Evolutionary correction. Saturation
Homoplasy Selection of models of molecular evolution David Posada Homoplasy indicates identity not produced by descent from a common ancestor. Graduate class in Phylogenetics, Campus Agrário de Vairão,
More informationAnatomy of a species tree
Anatomy of a species tree T 1 Size of current and ancestral Populations (N) N Confidence in branches of species tree t/2n = 1 coalescent unit T 2 Branch lengths and divergence times of species & populations
More informationMarkov chain Monte-Carlo to estimate speciation and extinction rates: making use of the forest hidden behind the (phylogenetic) tree
Markov chain Monte-Carlo to estimate speciation and extinction rates: making use of the forest hidden behind the (phylogenetic) tree Nicolas Salamin Department of Ecology and Evolution University of Lausanne
More informationTHE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT
COMMUNICATIONS IN INFORMATION AND SYSTEMS c 2009 International Press Vol. 9, No. 4, pp. 295-302, 2009 001 THE THREE-STATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2-SAT DAN GUSFIELD AND YUFENG WU Abstract.
More informationV (v i + W i ) (v i + W i ) is path-connected and hence is connected.
Math 396. Connectedness of hyperplane complements Note that the complement of a point in R is disconnected and the complement of a (translated) line in R 2 is disconnected. Quite generally, we claim that
More informationTensors. Lek-Heng Lim. Statistics Department Retreat. October 27, Thanks: NSF DMS and DMS
Tensors Lek-Heng Lim Statistics Department Retreat October 27, 2012 Thanks: NSF DMS 1209136 and DMS 1057064 L.-H. Lim (Stat Retreat) Tensors October 27, 2012 1 / 20 tensors on one foot a tensor is a multilinear
More informationBIG4: Biosystematics, informatics and genomics of the big 4 insect groups- training tomorrow s researchers and entrepreneurs
BIG4: Biosystematics, informatics and genomics of the big 4 insect groups- training tomorrow s researchers and entrepreneurs Kick-Off Meeting 14-18 September 2015 Copenhagen, Denmark This project has received
More information