Reconstructing the biological past: models, methods, performance, limits


1 Reconstructing the biological past: models, methods, performance, limits. Olivier Gascuel, Centre de Bioinformatique, Biostatistique et Biologie Intégrative (C3BI), USR 3756, Institut Pasteur & CNRS
2 O. Gascuel, M. Steel. Inferring ancestral sequences in taxon-rich phylogenies. Mathematical Biosciences 227(2). O. Gascuel, M. Steel. Predicting the ancestral character changes in a tree is typically easier than predicting the root state. Systematic Biology 63(3).
3 Focus on characters rather than trees. An introduction to phylogenetics: motivations; tree models; sequence and character evolution models; ancestral inference methods. Inferring the tree root. Inferring character changes. Uncertainty principle, perspectives.
4 Darwin (1837)
5 Haeckel (1875)
6 The Tree of Life
7 A growing impact. "Nothing in Biology Makes Sense Except in the Light of Evolution" (T. Dobzhansky)
8 Inferring the root character? A C A T C A G C
9 Broadcasting on trees? Ising model
10 (Adv. App. Prob. 2000)
11 (Adv. App. Prob. 2000) Uniform error probability. No branch length (time duration). The tree is fixed, not random.
12 Inferring all character changes? Tip states: A C A T C A G C; the 7 internal nodes are unknown (?).
13 Parallel Adaptations to High Temperatures in the Archean Eon Boussau*, Blanquart* et al Nature 2008
14 HIV-1 subtype A, Eastern & Southern Europe (Chevenet et al., Bioinformatics 2013)
15 HIV-1 subtype C
16 Phylogenetic tree models. Unoriented, labelled, binary tree. Mathematical expression. Search tree. Tips: Orangutan, Gorilla, Bonobo, Chimpanzee, Human
17 Phylogenetic tree models. Unoriented, labelled, binary tree. Mathematical expression. Search tree. Tips: Orangutan, Gorilla, Human, Chimpanzee, Bonobo
18 Phylogenetic tree models. Rooted tree. Time dimension (difficult to infer). Tips: Orangutan, Gorilla, Bonobo, Chimpanzee, Human
19 Phylogenetic tree models. $O(n^n)$ topologies. Tips: Orangutan, Gorilla, Bonobo, Chimpanzee, Human
20 Yule-Harding (YH) speciation model: topology (1924, 1971). Start from an initial species (leaf). Until we obtain n extant species, randomly select a leaf in the growing tree and speciate that (ancestral) species into 2 new species. Labels are uniformly assigned to the tree leaves. Robust to extinction and sampling.
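The growth process described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `yule_harding_cherries`, the parent-array representation and the replicate counts are hypothetical choices, not part of the talk. It grows a topology by repeatedly splitting a uniformly chosen leaf, then counts cherries (internal nodes whose two children are both leaves).

```python
import random

def yule_harding_cherries(n, rng):
    """Grow a Yule-Harding topology with n leaves; return its number of cherries.

    Nodes are integers; parent[i] is the parent of node i (None for the root).
    """
    parent = [None]          # node 0 is the initial species
    leaves = [0]
    while len(leaves) < n:
        # pick a leaf uniformly at random and split it into two new species
        i = rng.randrange(len(leaves))
        v = leaves[i]
        a, b = len(parent), len(parent) + 1
        parent.extend([v, v])
        leaves[i] = a        # v becomes internal; a takes its slot in the leaf list
        leaves.append(b)
    # a cherry is an internal node whose two children are both leaves
    leaf_children = {}
    for leaf in leaves:
        p = parent[leaf]
        leaf_children[p] = leaf_children.get(p, 0) + 1
    return sum(1 for c in leaf_children.values() if c == 2)

rng = random.Random(0)       # fixed seed so the run is repeatable
n = 300
mean_cherries = sum(yule_harding_cherries(n, rng) for _ in range(200)) / 200
print(mean_cherries / n)     # compare with the 1/3 expected under the YH model
```

Leaf labels are not simulated here because cherry counts do not depend on them.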
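21 The YH distribution is not uniform. Expected number of cherries: n/3 (YH) versus n/4 (uniform, PDA). Expected diameter: O(log(n)) (YH) versus O(sqrt(n)) (uniform, PDA). Figure: diameter (n = 95), YH versus uniform (PDA).
22 Yule-Harding model with time-valued edges. The speciation time on a given branch follows an exponential law (without memory) of parameter $\lambda$ (expectation $= 1/\lambda$). Figure labels: $1/\lambda$ on each branch.
23 Yule-Harding model with time-valued edges. The minimum of k independent exponential laws of parameter $\lambda$ is an exponential law with parameter $k\lambda$. Figure labels: expected waiting times $1/\lambda, 1/(2\lambda), 1/(3\lambda), \ldots, 1/(9\lambda)$, of order $O(1/n)$ with n lineages.
24 Yule-Harding model with time-valued edges. The minimum of k independent exponential laws of parameter $\lambda$ is an exponential law with parameter $k\lambda$ (many other, more sophisticated models exist). Figure labels: expected waiting times $1/\lambda, 1/(2\lambda), 1/(3\lambda), \ldots, 1/(9\lambda)$, of order $O(1/n)$ with n lineages.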
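As a small check of the waiting-time argument, the sketch below assumes that each lineage speciates at rate `lam` (a stand-in for the speciation rate λ), so that the wait with k lineages is exponential with rate k·lam. It compares the expected time to reach n lineages, the sum of 1/(k·lam), with a simulated average. The function names and the values of `n` and `lam` are illustrative assumptions.

```python
import random

def expected_height(n, lam):
    """Expected time to grow from 1 to n lineages: sum of 1/(k*lam), k = 1..n-1."""
    return sum(1.0 / (k * lam) for k in range(1, n))

def simulated_height(n, lam, rng):
    """One simulated time to reach n lineages (wait with k lineages ~ Exp(k*lam))."""
    return sum(rng.expovariate(k * lam) for k in range(1, n))

rng = random.Random(1)       # fixed seed so the run is repeatable
n, lam = 10, 1.0
mean_sim = sum(simulated_height(n, lam, rng) for _ in range(20000)) / 20000
print(expected_height(n, lam), mean_sim)
```

The harmonic sum grows like log(n)/lam, so doubling the number of species adds only a constant amount of expected time.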
25 Modeling sequence (and character) evolution. We aim to explain the data (alignment) using a probabilistic scenario of the evolution of the sites along a phylogeny.
26 Modeling sequence (character) evolution. Figure labels: A A C
27 Modeling sequence (character) evolution. Figure labels: A A A C
28 Modeling sequence (character) evolution. Figure labels: A or C; A A A C. Parsimony.
29 Modeling sequence (character) evolution. Figure labels: A or C; A; ACGT; ACGT; A A C. Parsimony. Probabilistic modelling.
30 Modeling sequence evolution: standard assumptions. Evolution is independent among lineages. Evolution is memoryless (Markov model). The sites evolve independently and identically. Models are time reversible. Models are time homogeneous and stationary.
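31 The simplest RY (0,1) symmetrical, time-continuous Markov model: R ⇄ Y, with rate $\mu$ in each direction. Expected number of mutations over duration $t$: $\mu t$.
32 The simplest RY (0,1) symmetrical, time-continuous Markov model: R ⇄ Y, with rate $\mu$. The rate matrix: $Q = \begin{pmatrix} -\mu & \mu \\ \mu & -\mu \end{pmatrix}$. The matrix of probability changes: $P(t) = e^{Qt} = \frac{1}{2}\begin{pmatrix} 1 + e^{-2\mu t} & 1 - e^{-2\mu t} \\ 1 - e^{-2\mu t} & 1 + e^{-2\mu t} \end{pmatrix}$. The equilibrium distribution: $\pi_R = \pi_Y = 1/2$.
33 The simplest RY (0,1) symmetrical, time-continuous Markov model. The rate matrix $Q$, the matrix of probability changes $P(t) = \frac{1}{2}\begin{pmatrix} 1 + e^{-2\mu t} & 1 - e^{-2\mu t} \\ 1 - e^{-2\mu t} & 1 + e^{-2\mu t} \end{pmatrix}$ and the equilibrium distribution $\pi_R = \pi_Y = 1/2$ are as on the previous slide. This model is time-reversible: $\pi_X P_{XY}(t) = \pi_Y P_{YX}(t)$. We assume stationarity (frequencies of R and Y are nearly equal).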
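The closed form for the two-state model can be checked numerically. The sketch below assumes the rate matrix has off-diagonal entries μ and diagonal entries −μ, and compares the closed-form P(t) with a truncated power series for exp(Qt). The helper names `p_matrix` and `expm_series` and the values of `mu` and `t` are illustrative, and the series is only meant for this tiny 2×2 case.

```python
import math

def p_matrix(mu, t):
    """Closed-form transition probabilities of the symmetric two-state model."""
    same = 0.5 * (1.0 + math.exp(-2.0 * mu * t))
    diff = 0.5 * (1.0 - math.exp(-2.0 * mu * t))
    return [[same, diff], [diff, same]]

def expm_series(q, t, terms=60):
    """exp(Q t) for a 2x2 matrix via a truncated power series (illustration only)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term_k = term_{k-1} * Q * t / k
        term = [[sum(term[i][m] * q[m][j] for m in range(2)) * t / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

mu, t = 0.3, 2.0
Q = [[-mu, mu], [mu, -mu]]
P_closed = p_matrix(mu, t)
P_series = expm_series(Q, t)
pi = [0.5, 0.5]              # equilibrium distribution (R, Y)
# time reversibility: pi_X * P_XY(t) equals pi_Y * P_YX(t)
print(pi[0] * P_closed[0][1], pi[1] * P_closed[1][0])
```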
34 Jukes and Cantor model (JC69) for DNA. Rate matrix M over (A, T, C, G): all off-diagonal rates are equal. Eq. (1/4, 1/4, 1/4, 1/4).
35 Felsenstein 1981 (F81) model for DNA. Rate matrix M over (A, T, C, G): the rate from i to j (j ≠ i) is proportional to $\pi_j$. Eq. $(\pi_A, \pi_T, \pi_C, \pi_G)$. Felsenstein's 1981 model allows for any arbitrary set of equilibrium frequencies.
36 Kimura 2-parameter (K2P) model for DNA. Rate matrix M over (A, T, C, G): one rate for transitions, another for transversions. Eq. (1/4, 1/4, 1/4, 1/4). Kimura's 2-parameter model aims at reflecting the fact that transitions are more frequent than transversions.
37 Hasegawa, Kishino, Yano (HKY) model for DNA. Rate matrix M over (A, T, C, G): the rate from i to j (j ≠ i) is proportional to $\pi_j$, with a distinct factor for transitions. Eq. $(\pi_A, \pi_T, \pi_C, \pi_G)$. The HKY model is a way to incorporate both transition/transversion bias and an arbitrary set of equilibrium frequencies. It captures the two main aspects of DNA evolution.
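To make the relationship between the four DNA models concrete, here is a minimal sketch of an HKY-style rate matrix. It assumes the usual parameterisation in which the rate to nucleotide j is proportional to its equilibrium frequency and transitions (A↔G, C↔T) carry an extra factor, called `kappa` here. That symbol, the function name, the A, C, G, T ordering (the slides use A, T, C, G) and the numerical values are assumptions for illustration; the matrix is left unnormalised.

```python
def hky_rate_matrix(pi, kappa):
    """HKY-style rate matrix over the order A, C, G, T (unnormalised)."""
    order = "ACGT"
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(order):
        for j, y in enumerate(order):
            if i != j:
                # rate to y is proportional to pi[y]; transitions get the factor kappa
                q[i][j] = pi[j] * (kappa if (x, y) in transitions else 1.0)
        q[i][i] = -sum(q[i])   # diagonal set so that each row sums to zero
    return q

pi_hky = [0.3, 0.2, 0.2, 0.3]          # hypothetical frequencies for A, C, G, T
Q_hky = hky_rate_matrix(pi_hky, kappa=4.0)
Q_jc = hky_rate_matrix([0.25] * 4, kappa=1.0)   # equal rates: the JC69 special case
for row in Q_hky:
    print(row)
```

With `kappa = 1` and arbitrary frequencies this reduces to the F81 form, and with uniform frequencies and `kappa` different from 1 it reduces to the K2P form.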
38 Reconstruction methods: Majority. The majority state at the tree tips is predicted (no knowledge of the tree or the model).
39 Reconstruction methods: Parsimony. We minimize the number of changes along tree branches (no knowledge of the model and time duration). 1st: recursive postorder calculation (bottom-up).
40 Reconstruction methods: Parsimony. We minimize the number of changes along tree branches (no knowledge of the model and time duration). 2nd: recursive preorder calculation (top-down).
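The two passes can be sketched with a Fitch-style algorithm. The slides do not name the exact variant, so treat this as one common way to implement the bottom-up and top-down steps. The nested-tuple tree, its balanced shape, and the function names are hypothetical; the tip states are those shown on the earlier slides.

```python
def fitch_up(node):
    """Bottom-up pass: return (annotated node, minimum number of changes below it)."""
    if isinstance(node, str):                  # a tip carries its observed state
        return {"set": {node}, "children": []}, 0
    left, c_left = fitch_up(node[0])
    right, c_right = fitch_up(node[1])
    common = left["set"] & right["set"]
    if common:                                 # children agree: no extra change
        return {"set": common, "children": [left, right]}, c_left + c_right
    union = left["set"] | right["set"]         # children disagree: one more change
    return {"set": union, "children": [left, right]}, c_left + c_right + 1

def fitch_down(node, parent_state=None):
    """Top-down pass: pick one state per node, keeping the parent's state when allowed."""
    if parent_state in node["set"]:
        node["state"] = parent_state
    else:
        node["state"] = min(node["set"])       # arbitrary but deterministic tie-break
    for child in node["children"]:
        fitch_down(child, node["state"])
    return node

def count_changes(node):
    """Number of branches whose two ends carry different states."""
    return sum((c["state"] != node["state"]) + count_changes(c) for c in node["children"])

# hypothetical balanced tree over the tip states A C A T C A G C
tree = ((("A", "C"), ("A", "T")), (("C", "A"), ("G", "C")))
root, n_changes = fitch_up(tree)
fitch_down(root)
print(sorted(root["set"]), n_changes, count_changes(root))
```

The top-down pass returns a single most-parsimonious scenario; when the root set contains several states, other equally parsimonious scenarios exist.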
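41 Maximum Likelihood. Requires knowing (or estimating) the tree, branch lengths, and model parameter values. One predicts the state with maximum posterior probability (MAP). Best possible method!
42 Maximum Likelihood. Recursive postorder calculation of the marginal likelihood of the root states. For a tree $T$ whose root has subtrees $U$ and $V$ attached by branches of length $u$ and $v$: $L(h \mid T, M) = \left[\sum_{h' \in \{A,C,G,T\}} P(h' \mid h, u, M)\, L(h' \mid U, M)\right] \times \left[\sum_{h' \in \{A,C,G,T\}} P(h' \mid h, v, M)\, L(h' \mid V, M)\right]$. For a tip: $L(h \mid U, M) = 1$ if $U = h$, else 0.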
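The postorder recursion can be sketched for the two-state symmetric model (states 0 and 1 standing in for R and Y). The tree encoding, the function names, the branch lengths and the value of `mu` are assumptions made for this example; the root prior is taken to be the stationary distribution (1/2, 1/2).

```python
import math

def p_change(mu, t):
    """2x2 transition matrix of the symmetric two-state model over duration t."""
    e = math.exp(-2.0 * mu * t)
    return [[0.5 * (1 + e), 0.5 * (1 - e)], [0.5 * (1 - e), 0.5 * (1 + e)]]

def partial_likelihood(node, mu):
    """Return [L(0), L(1)]: likelihood of the tips below `node` given its state.

    A tip is an int (0 or 1); an internal node is ((child, length), (child, length)).
    """
    if isinstance(node, int):
        return [1.0 if node == h else 0.0 for h in (0, 1)]
    result = [1.0, 1.0]
    for child, length in node:
        p = p_change(mu, length)
        child_l = partial_likelihood(child, mu)
        for h in (0, 1):
            # sum over the child's state h2, weighted by the change probability
            result[h] *= sum(p[h][h2] * child_l[h2] for h2 in (0, 1))
    return result

def root_posterior(tree, mu, prior=(0.5, 0.5)):
    """Marginal posterior of the root state, using the stationary prior by default."""
    lik = partial_likelihood(tree, mu)
    joint = [prior[h] * lik[h] for h in (0, 1)]
    total = sum(joint)
    return [x / total for x in joint]

# hypothetical 3-tip tree ((0, 0), 1) with arbitrary branch lengths
tree = ((((0, 0.1), (0, 0.1)), 0.2), (1, 0.3))
posterior = root_posterior(tree, mu=1.0)
map_state = max((0, 1), key=lambda h: posterior[h])
print(posterior, map_state)
```

The same recursion extends to four nucleotides by replacing the 2×2 matrix with a 4×4 one.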
43 Maximum Likelihood. Computation of the best scenario. We apply (independently!) the same algorithm to every internal node: marginal posterior for each. We use a dynamic programming approach (Pupko et al. 2000) to compute the scenario with maximal joint probability (but the number of scenarios is exponential).
44 Results (OG & Steel). What part of the past is reconstructible? (PAC etc.) Can we compare the different methods? (simulations) In this presentation: RY (0/1) symmetric model ($\mu$). Yule-Harding model with $t \to \infty$ (or $n \to \infty$) ($\lambda/\mu$ is key) (under which conditions does the past disappear?). Yule-Harding trees with $t$ fixed and $\lambda \to \infty$ (impact of the sample size?). Extreme trees (examples and counterexamples).
45 Root state: fundamental limitation. Yule-Harding trees (YH) with $t \to \infty$ (or $n \to \infty$). For any root prediction method $M$: $PA_M \to 1/2$ if $\lambda < 4\mu$, where $PA_M$: predictive accuracy of root reconstruction with $M$; $\lambda$: speciation rate; $\mu$: substitution rate. Olivier Gascuel, Reconstructing the biological past, Polytechnique, November 2017
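46 Root state: fundamental limitation. Yule-Harding trees (YH) with $t \to \infty$ (or $n \to \infty$): $PA_M \to 1/2$ if $\lambda < 4\mu$. Mutual information $I(\rho; L)$ between the root state $\rho$ and the leaf set $L$, and accuracy $PA_M$: the accuracy tends to 1/2 when $I(\rho; L)$ tends to 0. Mutual information erosion with time: $I(\rho; L) \le n \exp(-4\mu t) \approx \exp((\lambda - 4\mu)\,t)$, since $n \approx e^{\lambda t}$.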
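The threshold λ < 4μ can be read off a short calculation. The lines below are a hedged reconstruction rather than the slide's own derivation: they assume that the expected number of tips of a Yule-Harding tree of height t started from one lineage is e^{λt}, and that each root-to-tip path contributes at most the squared second eigenvalue of P(t), namely (e^{-2μt})² = e^{-4μt}, to the mutual information. Here ρ denotes the root state and L the leaf set.

```latex
\mathbb{E}[n] = e^{\lambda t}, \qquad
I(\rho; L) \;\le\; n \, e^{-4\mu t} \;\approx\; e^{(\lambda - 4\mu)\, t}
\;\xrightarrow[t \to \infty]{}\; 0
\quad \text{if } \lambda < 4\mu .
```

When the exponent is negative, the tips carry vanishing information about the root, so no method can do better than a coin flip in the limit.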
47 Root state: parsimony limitation. Yule-Harding trees (YH) with $t \to \infty$: $PA_{\text{Parsimony}} \to 1/2$ if $\lambda < 6\mu$.
48 Root state: Majority rule and MAP (Mossel and Steel 2014). YH trees with $t \to \infty$. Majority has the best possible bound: $\lambda/\mu = 4$. Thus, the same holds for MAP.
49 Root state: Majority rule. YH tree, fixed $t$ and $\mu$, $\lambda \to \infty$: $PA_{\text{Majority}} \xrightarrow{p} 1$. With a conservative model: $P_{ii}(t) > P_{ij}(t)$. If i is the root state, we expect a majority of i among tips. But the tree paths are not independent!
50 Root state: Majority rule. Star tree: independent paths, law of large numbers? Figure labels: AA, AG, AC, AT; tips A C A T C A G A.
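On a star tree the tips are independent given the root, so the accuracy of the majority rule is a binomial tail and can be computed exactly. The sketch below uses the two-state symmetric model, where a tip shows the root state with probability (1 + e^{-2μt})/2; the function name, the fair-coin tie-break and the parameter values are assumptions for illustration.

```python
import math

def majority_accuracy_star(n, mu, t):
    """Probability that the majority state among n independent tips equals the root state.

    Ties (possible when n is even) are broken by a fair coin.
    """
    p = 0.5 * (1.0 + math.exp(-2.0 * mu * t))    # P_ii(t): a tip still shows the root state
    acc = 0.0
    for k in range(n + 1):                        # k tips carry the root state
        prob = math.comb(n, k) * p**k * (1 - p)**(n - k)
        if 2 * k > n:
            acc += prob
        elif 2 * k == n:
            acc += 0.5 * prob
    return acc

sizes = (1, 11, 101, 501)
accuracies = [majority_accuracy_star(n, mu=1.0, t=1.0) for n in sizes]
for n, acc in zip(sizes, accuracies):
    print(n, round(acc, 4))
```

Accuracy rises towards 1 as the number of tips grows, which is the law-of-large-numbers effect the slide asks about; on a real tree the shared history of the tips slows this down.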
51 Root state: Majority rule. What is the spread of Yule-Harding trees?
52 Root state: Majority rule. Spread index: $S(T) = \frac{\sum_{x,y} l_{xy}}{n(n-1)\,t}$ (figure labels: tips X and Y, length $l_{XY}$, height $t$). Theorem: for any fixed $\mu$ and $t$, and any $\epsilon > 0$, the probability that $S(T)$ is larger than $\epsilon$ tends to 0 when $\lambda \to \infty$ ($\lambda$: speciation rate). Then, T becomes close to a star tree, and the accuracy of the majority rule converges to 1.
53 Root state: Parsimony and MAP. With YH trees, fixed $t$ and $\mu$, $\lambda \to \infty$: $PA_{\text{Parsimony}} \xrightarrow{p} 1$ and $PA_{\text{MAP}} \xrightarrow{p} 1$. Realistic simulations: MAP > Majority > Parsimony. Majority is affected by potential sampling biases. MAP and Parsimony are surprisingly robust.
54 Root/internal nodes: not that simple! Root: yes, nodes: no. Root: no, nodes: yes.
55 Internal nodes and YH trees. Yule-Harding trees with $t \to \infty$ (or $n \to \infty$): $PA_v > 1/2$ ($\lambda$, $\mu$ fixed).
56 Internal nodes and YH trees. Yule-Harding trees with $t \to \infty$ (or $n \to \infty$): $PA_v > 1/2$ ($\lambda$, $\mu$ fixed). At least half of the nodes are connected to a tip. In a time-conditioned YH tree, the expected length of pending branches is $1/(2\lambda)$. Thus, the mutual information is > 0 and the predictability > 1/2.
57 Internal nodes and YH trees. Yule-Harding trees with $t \to \infty$ (or $n \to \infty$): $PA_v > 1/2$ ($\lambda$, $\mu$ fixed). Strong contrast with the tree root, where $PA \to 1/2$ if $\lambda < 4\mu$ (but no quantitative input).
58 Realistic simulation results: HKY+Γ, PA versus n.
59 Discussion. Internal nodes are much easier than the tree root. Method robustness: model violation (ML, MP, MAJ), sampling bias (ML and MP), tree uncertainty (all). In phylogeography we predict well the flows among countries (but not the tree root). Lakner et al. (2010) results demonstrate that these methods predict stable/credible ancestral proteins (but not the future ;)).
arxiv: v1 [qbio.pe] 4 Sep 2013
Version dated: September 5, 2013 Predicting ancestral states in a tree arxiv:1309.0926v1 [qbio.pe] 4 Sep 2013 Predicting the ancestral character changes in a tree is typically easier than predicting the
More informationDr. Amira A. ALHosary
Phylogenetic analysis Amira A. ALHosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut UniversityEgypt Phylogenetic Basics: Biological
More informationAmira A. ALHosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut
Amira A. ALHosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut UniversityEgypt Phylogenetic analysis Phylogenetic Basics: Biological
More informationPhylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University
Phylogenetics: Distance Methods COMP 571  Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distancebased methods Evolutionary Models and Distance Correction
More informationPhylogenetic Tree Reconstruction
I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven
More informationPOPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics
POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics  in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa.  before we review the
More informationHow should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?
How should we go about modeling this? gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time Substitution rate Can we observe time or subst. rate? What
More informationAdditive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.
Additive distances Let T be a tree on leaf set S and let w : E R + be an edgeweighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then
More informationPhylogenetics: Parsimony and Likelihood. COMP Spring 2016 Luay Nakhleh, Rice University
Phylogenetics: Parsimony and Likelihood COMP 571  Spring 2016 Luay Nakhleh, Rice University The Problem Input: Multiple alignment of a set S of sequences Output: Tree T leaflabeled with S Assumptions
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istancebased methods Ultrametric Additive: UPGMA Transformed istance NeighborJoining Characterbased Maximum Parsimony Maximum Likelihood
More informationMolecular Evolution, course # Final Exam, May 3, 2006
Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least
More informationPhylogenetics: Likelihood
1 Phylogenetics: Likelihood COMP 571 Luay Nakhleh, Rice University The Problem 2 Input: Multiple alignment of a set S of sequences Output: Tree T leaflabeled with S Assumptions 3 Characters are mutually
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods
More information9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)
I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by
More informationSubstitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A
GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC
More informationPhylogenetic inference
Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis) advantages of different information types
More informationarxiv: v1 [qbio.pe] 1 Jun 2014
THE MOST PARSIMONIOUS TREE FOR RANDOM DATA MAREIKE FISCHER, MICHELLE GALLA, LINA HERBST AND MIKE STEEL arxiv:46.27v [qbio.pe] Jun 24 Abstract. Applying a method to reconstruct a phylogenetic tree from
More informationBMI/CS 776 Lecture 4. Colin Dewey
BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2
More informationEVOLUTIONARY DISTANCES
EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:
More informationC.DARWIN ( )
C.DARWIN (18091882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships
More information"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky
MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION  theory that groups of organisms change over time so that descendeants differ structurally
More informationMaximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.
Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters  "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf
More informationInferring Molecular Phylogeny
Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction
More informationPhylogeny of Mixture Models
Phylogeny of Mixture Models Daniel Štefankovič Department of Computer Science University of Rochester joint work with Eric Vigoda College of Computing Georgia Institute of Technology Outline Introduction
More informationSequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the loglikelihood ratio of
More informationBINF6201/8201. Molecular phylogenetic methods
BINF60/80 Molecular phylogenetic methods 0706 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics
More informationPhylogenetics: Building Phylogenetic Trees
1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should
More informationLecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).
1 Bioinformatics: Indepth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff
More informationLecture 4. Models of DNA and protein change. Likelihood methods
Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36
More informationPhylogenetics: Parsimony
1 Phylogenetics: Parsimony COMP 571 Luay Nakhleh, Rice University he Problem 2 Input: Multiple alignment of a set S of sequences Output: ree leaflabeled with S Assumptions Characters are mutually independent
More informationπ b = a π a P a,b = Q a,b δ + o(δ) = 1 + Q a,a δ + o(δ) = I 4 + Qδ + o(δ),
ABC estimation of the scaled effective population size. Geoff Nicholls, DTC 07/05/08 Refer to http://www.stats.ox.ac.uk/~nicholls/dtc/tt08/ for material. We will begin with a practical on ABC estimation
More informationPhylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University
Phylogenetics: Building Phylogenetic Trees COMP 571  Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary
More informationA (short) introduction to phylogenetics
A (short) introduction to phylogenetics Thibaut Jombart, MariePauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field
More informationPhylogenetic Assumptions
Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by
More informationEstimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6057
Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 42818 Jordan 6057 Tree estimation strategies: Parsimony?no model, simply count minimum number
More informationProbabilistic modeling and molecular phylogeny
Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical
More informationLecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)
Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distancebased methods Ultrametric Additive: UPGMA Transformed Distance NeighborJoining Characterbased Maximum Parsimony Maximum Likelihood
More informationMolecular Evolution and Phylogenetic Tree Reconstruction
1 4 Molecular Evolution and Phylogenetic Tree Reconstruction 3 2 5 1 4 2 3 5 Orthology, Paralogy, Inparalogs, Outparalogs Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length
More information进化树构建方法的概率方法 第 4 章 : 进化树构建的概率方法 问题介绍. 部分 lid 修改自 i i f l 的 ih l i
第 4 章 : 进化树构建的概率方法 问题介绍 进化树构建方法的概率方法 部分 lid 修改自 i i f l 的 ih l i 部分 Slides 修改自 University of Basel 的 Michael Springmann 课程 CS302 Seminar Life Science Informatics 的讲义 Phylogenetic Tree branch internal node
More informationMaximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018
Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras
More informationTheDiskCovering MethodforTree Reconstruction
TheDiskCovering MethodforTree Reconstruction Daniel Huson PACM, Princeton University Bonn, 1998 1 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document
More informationLecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26
Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and
More informationUsing algebraic geometry for phylogenetic reconstruction
Using algebraic geometry for phylogenetic reconstruction Marta Casanellas i Rius (joint work with Jesús FernándezSánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya IMA
More informationConcepts and Methods in Molecular Divergence Time Estimation
Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks
More informationEvolutionary trees. Describe the relationship between objects, e.g. species or genes
Evolutionary trees Bonobo Chimpanzee Human Neanderthal Gorilla Orangutan Describe the relationship between objects, e.g. species or genes Early evolutionary studies The evolutionary relationships between
More informationEstimating Evolutionary Trees. Phylogenetic Methods
Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent
More informationEvolutionary Tree Analysis. Overview
CSI/BINF 5330 Evolutionary Tree Analysis YoungRae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds DistanceBased Evolutionary Tree Reconstruction CharacterBased
More informationTaming the Beast Workshop
Workshop and Chi Zhang June 28, 2016 1 / 19 Species tree Species tree the phylogeny representing the relationships among a group of species Figure adapted from [Rogers and Gibbs, 2014] Gene tree the phylogeny
More informationImproving divergence time estimation in phylogenetics: more taxa vs. longer sequences
Mathematical Statistics Stockholm University Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences Bodil Svennblad Tom Britton Research Report 2007:2 ISSN 6500377 Postal
More informationPhylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center
Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distancebased methods
More informationPhylogenetics. BIOL 7711 Computational Bioscience
Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationLetter to the Editor. Department of Biology, Arizona State University
Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona
More informationChapter 7: Models of discrete character evolution
Chapter 7: Models of discrete character evolution pdf version R markdown to recreate analyses Biological motivation: Limblessness as a discrete trait Squamates, the clade that includes all living species
More informationThe statistical and informatics challenges posed by ascertainment biases in phylogenetic data collection
The statistical and informatics challenges posed by ascertainment biases in phylogenetic data collection Mark T. Holder and Jordan M. Koch Department of Ecology and Evolutionary Biology, University of
More informationBayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies
Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development
More informationAlgebraic Statistics Tutorial I
Algebraic Statistics Tutorial I Seth Sullivant North Carolina State University June 9, 2012 Seth Sullivant (NCSU) Algebraic Statistics June 9, 2012 1 / 34 Introduction to Algebraic Geometry Let R[p] =
More informationLikelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution
Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions
More informationLecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22
Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and
More informationReconstruction of certain phylogenetic networks from their treeaverage distances
Reconstruction of certain phylogenetic networks from their treeaverage distances Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu October 10,
More informationTheory of Evolution Charles Darwin
Theory of Evolution Charles arwin 85859: Origin of Species 5 year voyage of H.M.S. eagle (8336) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties
More informationLie Markov models. Jeremy Sumner. School of Physical Sciences University of Tasmania, Australia
Lie Markov models Jeremy Sumner School of Physical Sciences University of Tasmania, Australia Stochastic Modelling Meets Phylogenetics, UTAS, November 2015 Jeremy Sumner Lie Markov models 1 / 23 The theory
More information1. Can we use the CFN model for morphological traits?
1. Can we use the CFN model for morphological traits? 2. Can we use something like the GTR model for morphological traits? 3. Stochastic Dollo. 4. Continuous characters. Mk models kstate variants of the
More informationFrom Individualbased Population Models to Lineagebased Models of Phylogenies
From Individualbased Population Models to Lineagebased Models of Phylogenies Amaury Lambert (joint works with G. Achaz, H.K. Alexander, R.S. Etienne, N. Lartillot, H. Morlon, T.L. Parsons, T. Stadler)
More informationConsistency Index (CI)
Consistency Index (CI) minimum number of changes divided by the number required on the tree. CI=1 if there is no homoplasy negatively correlated with the number of species sampled Retention Index (RI)
More informationMassachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution
Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral
More informationLecture Notes: Markov chains
Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity
More informationHMM for modeling aligned multiple sequences: phylohmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylohmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington
More informationMolecular Evolution & Phylogenetics
Molecular Evolution & Phylogenetics Heuristics based on tree alterations, maximum likelihood, Bayesian methods, statistical confidence measures JeanBaka Domelevo Entfellner Learning Objectives know basic
More informationPhylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.
Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony
More informationPhylogenetic invariants versus classical phylogenetics
Phylogenetic invariants versus classical phylogenetics Marta Casanellas Rius (joint work with Jesús FernándezSánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya Algebraic
More informationLecture 6 Phylogenetic Inference
Lecture 6 Phylogenetic Inference From Darwin s notebook in 1837 Charles Darwin Willi Hennig From The Origin in 1859 Cladistics Phylogenetic inference Willi Hennig, Cladistics 1. Clade, Monophyletic group,
More informationEvolutionary Models. Evolutionary Models
Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment
More information3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylohmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylohmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,
More informationThe Generalized Neighbor Joining method
The Generalized Neighbor Joining method Ruriko Yoshida Dept. of Mathematics Duke University Joint work with Dan Levy and Lior Pachter www.math.duke.edu/ ruriko data mining 1 Challenge We would like to
More informationDiscrete & continuous characters: The threshold model
Discrete & continuous characters: The threshold model Discrete & continuous characters: the threshold model So far we have discussed continuous & discrete character models separately for estimating ancestral
Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,
Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin 1837
Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley. Parsimony & Likelihood [draft]
Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley K.W. Will Parsimony & Likelihood [draft] 1. Hennig and Parsimony: Hennig was not concerned with parsimony
Identifiability of the GTR+Γ substitution model (and other models) of DNA evolution
Identifiability of the GTR+Γ substitution model (and other models) of DNA evolution Elizabeth S. Allman Dept. of Mathematics and Statistics University of Alaska Fairbanks TM Current Challenges and Problems
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley B.D. Mishler Jan. 22, 2009. Trees I. Summary of previous lecture: Hennigian
Phylogeny: building the tree of life
Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan
A Phylogenetic Network Construction due to Constrained Recombination
A Phylogenetic Network Construction due to Constrained Recombination Mohd. Abdul Hai Zahid Research Scholar Research Supervisors: Dr. R.C. Joshi Dr. Ankush Mittal Department of Electronics and Computer
Minimum evolution using ordinary least-squares is less robust than neighbor-joining
Minimum evolution using ordinary least-squares is less robust than neighbor-joining Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA email: swillson@iastate.edu November
Is the equal branch length model a parsimony model?
Table 1: An approximation of the probability of data patterns on the tree shown in figure?? made by dropping terms that do not have the minimal exponent for p. Terms that were dropped are shown in red;
Maximum Likelihood in Phylogenetics
Maximum Likelihood in Phylogenetics June 1, 2009 Smithsonian Workshop on Molecular Evolution Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs, CT Copyright 2009
Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies
Inferring Phylogenetic Trees Distance Approaches Representing distances in rooted and unrooted trees The distance approach to phylogenies given: an n × n matrix M where M_ij is the distance between taxa
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics rsalakhu@utstat.toronto.edu http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
Week 5: Distance methods, DNA and protein models
Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03
Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences
Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 1 Learning Objectives
BIOL 1010 Introduction to Biology: The Evolution and Diversity of Life. Spring 2011 Sections A & B
BIOL 1010 Introduction to Biology: The Evolution and Diversity of Life. Spring 2011 Sections A & B Steve Thompson: stthompson@valdosta.edu http://www.bioinfo4u.net 1 ʻTree of Life,ʼ ʻprimitive,ʼ ʻprogressʼ
Probability Distribution of Molecular Evolutionary Trees: A New Method of Phylogenetic Inference
J Mol Evol (1996) 43:304-311 Springer-Verlag New York Inc. 1996 Probability Distribution of Molecular Evolutionary Trees: A New Method of Phylogenetic Inference Bruce Rannala, Ziheng Yang Department of
Stochastic Errors vs. Modeling Errors in Distance Based Phylogenetic Reconstructions
PLGW05 Stochastic Errors vs. Modeling Errors in Distance Based Phylogenetic Reconstructions 1 joint work with Ilan Gronau 2, Shlomo Moran 3, and Irad Yavneh 3 1 2 Dept. of Biological Statistics and Computational
Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics
Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Exam Mon. 30.01.2011 Time: 15:30-17:00 Room: HS14 Registration: Kusss Contents Methods and Bootstrapping of Maximum Methods Methods
Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)
Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures
Evolutionary trees. Describe the relationship between objects, e.g. species or genes
Evolutionary trees Bonobo Chimpanzee Human Neanderthal Gorilla Orangutan Describe the relationship between objects, e.g. species or genes Early evolutionary studies Anatomical features were the dominant
Properties of normal phylogenetic networks
Properties of normal phylogenetic networks Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu August 13, 2009 Abstract. A phylogenetic network is
Inferring Speciation Times under an Episodic Molecular Clock
Syst. Biol. 56(3):453-466, 2007 Copyright © Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150701420643 Inferring Speciation Times under an Episodic Molecular
What is Phylogenetics
What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)