Methods to reconstruct phylogenetic networks accounting for ILS
1 Methods to reconstruct phylogenetic networks accounting for ILS. Céline Scornavacca (some slides have been kindly provided by Fabio Pardi). ISE-M, Equipe Phylogénie & Evolution Moléculaires, Montpellier, France. February 2nd, 2017
7 Phylogenetic networks: Trees displayed by a network. In a phylogenetic network, a reticulate event is represented as a reticulation, where branches converge to give rise to a new lineage. The genome(s) at the start of the new lineage is (are) a composition of those of the parent lineages. The evolution of each independently inherited part is described by a gene tree. In the absence of deep coalescence and allopolyploidy, the gene trees are a subset of the trees displayed by the network. [Figure: a network on taxa a, b, c, d, e, f and trees it displays]
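The notion of "trees displayed by a network" can be made concrete with a small sketch. It assumes a network stored as a child-to-parents dictionary (a representation chosen here for illustration, not taken from the slides), keeps exactly one parent per reticulation node, and summarises each resulting tree by its set of clusters. The toy network, with a hybrid leaf b between the lineages of a and c, is hypothetical.

```python
from itertools import product

def displayed_trees(parents, leaves):
    """Enumerate the trees displayed by a network.

    parents: dict mapping each node to the list of its parents
             (empty list for the root, two entries for a reticulation).
    Each tree is returned as a frozenset of clusters (leaf sets of size > 1).
    """
    reticulations = [v for v, ps in parents.items() if len(ps) > 1]
    trees = set()
    # One "switching" = one choice of parent for every reticulation node.
    for choice in product(*(parents[r] for r in reticulations)):
        chosen = dict(zip(reticulations, choice))
        parent_of = {}
        for v, ps in parents.items():
            if v in chosen:
                parent_of[v] = chosen[v]
            else:
                parent_of[v] = ps[0] if ps else None
        # Collect, for every node, the leaves found below it in this switching.
        below = {v: set() for v in parents}
        for leaf in leaves:
            v = leaf
            while v is not None:
                below[v].add(leaf)
                v = parent_of[v]
        trees.add(frozenset(frozenset(c) for c in below.values() if len(c) > 1))
    return trees

# Hypothetical network: h is a reticulation with parents u (side of a) and v (side of c).
toy_network = {"r": [], "u": ["r"], "v": ["r"], "a": ["u"],
               "c": ["v"], "h": ["u", "v"], "b": ["h"]}
toy_trees = displayed_trees(toy_network, ["a", "b", "c"])
```

Comparing trees through their clusters avoids having to suppress the degree-two nodes left behind when a reticulation edge is switched off.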
8 Phylogenetic networks: Phylogenetic network inference. [Figure: an unknown network N on taxa a, b, c, d, e, f]
9 Phylogenetic networks: Phylogenetic network inference. An optimization problem where a candidate network is evaluated on the basis of how well the trees it displays fit the data. Many possible formulations. Data: trees with 3 taxa (triplets, inferred from other data). Goal: find the network N with the lowest hybridization number such that each triplet is consistent with one of the trees displayed by N, subject to constraints on the complexity of N. [Figure: network N on taxa a, b, c, d, e, f, the trees it displays, and example input triplets]
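To illustrate the triplet formulation, here is a minimal sketch that checks whether a rooted tree displays a triplet xy|z and whether every input triplet is displayed by at least one tree of a given set. Trees are written as nested Python tuples, which is an assumption made for this example; the function names are hypothetical.

```python
def leaf_set(tree):
    """Set of leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    leaves = set()
    for child in tree:
        leaves |= leaf_set(child)
    return leaves

def displays_triplet(tree, x, y, z):
    """True if x and y are more closely related to each other than to z."""
    if not {x, y, z} <= leaf_set(tree):
        return False
    node = tree
    while True:
        # Find the child subtree that still contains both x and y, if any.
        next_node = None
        children = () if isinstance(node, str) else node
        for child in children:
            if {x, y} <= leaf_set(child):
                next_node = child
        if next_node is None:
            return False  # x and y separate here while z is still below this node
        if z not in leaf_set(next_node):
            return True   # z branched off before x and y separated
        node = next_node

def triplets_consistent(trees, triplets):
    """Every triplet (x, y, z), read as xy|z, is displayed by some tree."""
    return all(any(displays_triplet(t, *tr) for t in trees) for tr in triplets)

# Two trees that a hypothetical network might display.
t1 = (("a", "b"), "c")
t2 = ("a", ("b", "c"))
```

With both trees available, the conflicting triplets ab|c and bc|a are jointly consistent, which no single tree could achieve.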
10 Phylogenetic networks: Phylogenetic network inference. An optimization problem where a candidate network is evaluated on the basis of how well the trees it displays fit the data. Many possible formulations. Data: any trees on the same taxa (inferred from other data). Goal: find the network N with the lowest hybridization number such that each input tree is consistent with one of the trees displayed by N, subject to constraints on the complexity of N. [Figure: network N on taxa a, b, c, d, e, f, the trees it displays, and example input trees]
11 Phylogenetic networks: Phylogenetic network inference. An optimization problem where a candidate network is evaluated on the basis of how well the trees it displays fit the data. Many possible formulations. Data: clusters of taxa: {a, b}, {d, e}, {d, e, f}, {a, b, c, d, e, f}, {e, f}, {c, d, e, f}, ... Goal: find the network N with the lowest hybridization number such that each input cluster is "explained" by one of the trees displayed by N, subject to constraints on the complexity of N. [Figure: network N on taxa a, b, c, d, e, f and the trees it displays]
12 Phylogenetic networks: Phylogenetic network inference. An optimization problem where a candidate network is evaluated on the basis of how well the trees it displays fit the data. Many possible formulations. Data: sequence alignments $A_1, A_2, \dots, A_m$ (typically given in blocks). Goal: find N that minimizes $F(N \mid A_1, A_2, \dots, A_m) = \sum_{i=1}^{m} \min_{T \in \mathcal{T}(N)} F(T \mid A_i)$, subject to constraints on the complexity of N, where $\mathcal{T}(N)$ is the set of trees displayed by N and F() is the parsimony score. Jin et al. Parsimony Score of Phylogenetic Networks: Hardness Results and a Linear-Time Heuristic. TCBB. [Figure: network N on taxa a, b, c, d, e, f and the trees it displays]
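The parsimony criterion above sums, over alignment blocks, the best score achieved by any displayed tree. A minimal sketch follows, assuming binary trees as nested tuples, the set of displayed trees already enumerated, and Fitch's algorithm for the per-tree score; the heuristic of Jin et al. is not reproduced here, and all names and data are illustrative.

```python
def fitch(tree, site):
    """Fitch pass for one site; returns (state set at this node, score below it).

    tree: a leaf name or a pair (left, right); site: dict leaf name -> character.
    """
    if isinstance(tree, str):
        return {site[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch(left, site)
    right_states, right_cost = fitch(right, site)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1

def tree_score(tree, block):
    """Parsimony score F(T | A_i) of one block (a list of sites)."""
    return sum(fitch(tree, site)[1] for site in block)

def network_score(displayed, blocks):
    """F(N | A_1..A_m): each block is scored on its best displayed tree."""
    return sum(min(tree_score(t, block) for t in displayed) for block in blocks)

# Two displayed trees of a hypothetical network and two one-site blocks.
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
site_1 = {"a": "A", "b": "A", "c": "C", "d": "C"}
site_2 = {"a": "A", "b": "C", "c": "A", "d": "C"}
blocks = [[site_1], [site_2]]
```

Here each block is explained with a single change by a different displayed tree, so the network scores 2, whereas either tree alone needs 3 changes over both blocks.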
14 Phylogenetic networks: Phylogenetic network inference. An optimization problem where a candidate network is evaluated on the basis of how well the trees it displays fit the data. Many possible formulations. Data: sequence alignments $A_1, A_2, \dots, A_m$ (typically given in blocks). Goal: find N that maximises $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} \Pr(A_i \mid N) = \prod_{i=1}^{m} \left( \sum_{T \in \mathcal{T}(N)} \Pr(A_i \mid T) \Pr(T \mid N) \right)$, where $\mathcal{T}(N)$ is the set of trees displayed by N. Jin et al. Maximum likelihood of phylogenetic networks. Bioinformatics. Links: https://phylomemetic.wordpress.com/2015/04/17/phylodag/ and http://old-bioinfo.cs.rice.edu/nepal/ [Figure: network N on taxa a, b, c, d, e, f with inheritance probabilities p and 1-p, and the trees it displays]
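The likelihood above treats each block as drawn from a mixture over the displayed trees. The sketch below only combines quantities that are assumed to be already available: the per-tree likelihoods Pr(A_i | T) and the tree probabilities Pr(T | N). It is not the implementation of Jin et al., and the numbers are made up for illustration.

```python
import math

def network_log_likelihood(block_tree_likelihoods, tree_probabilities):
    """log Pr(A_1..A_m | N) = sum_i log( sum_T Pr(A_i | T) * Pr(T | N) ).

    block_tree_likelihoods: one list per block, holding Pr(A_i | T) for each
    displayed tree T, in the same order as tree_probabilities.
    """
    total = 0.0
    for likelihoods in block_tree_likelihoods:
        mixture = sum(l * p for l, p in zip(likelihoods, tree_probabilities))
        total += math.log(mixture)
    return total

# Hypothetical network with one reticulation: two displayed trees with
# probabilities p and 1 - p, and two alignment blocks.
p = 0.3
example_loglik = network_log_likelihood(
    [[1e-5, 4e-5], [2e-6, 1e-6]],
    [p, 1 - p],
)
```

In practice the per-tree likelihoods would come from a standard tree likelihood computation (for example Felsenstein's pruning algorithm), which is outside the scope of this sketch.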
15 Problems: Some issues. Searching the space of phylogenetic networks: the space of networks with k reticulations is infinite. Controlling for model complexity: because any network with k reticulations provides a more complex model than any network with (k-1) reticulations, we must handle the model selection problem (AIC, BIC, K-fold cross-validation, ...). Identifiability issues. Not accounting for ILS and allopolyploidy: $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} \Pr(A_i \mid N) = \prod_{i=1}^{m} \left( \sum_{T \in \mathcal{T}(N)} \Pr(A_i \mid T) \Pr(T \mid N) \right)$.
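For the model selection step, the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L can be applied to networks with increasing numbers of reticulations. In the sketch below the log-likelihoods, the parameter counts k, and the choice of the number of loci as sample size n are all hypothetical; how to count the free parameters of a network is itself a modelling decision.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion with sample size n."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fits: (number of reticulations, log-likelihood, free parameters).
fits = [(0, -1250.0, 9), (1, -1238.0, 12), (2, -1237.5, 15)]
n_loci = 100

best_by_aic = min(fits, key=lambda f: aic(f[1], f[2]))
best_by_bic = min(fits, key=lambda f: bic(f[1], f[2], n_loci))
```

With these made-up values the second reticulation improves the likelihood too little to pay for its extra parameters, so both criteria select the network with one reticulation.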
17 Identifiability problems: Different networks can display the same trees. Some networks display exactly the same trees. Because N_1 and N_2 display the same trees, they are equally good for any of the inference methods we saw, no matter the input data. (Recall that a network is evaluated on the basis of how well the trees it displays fit the data.) [Figure: networks N_1 and N_2 on taxa a, b, c, d, both displaying the trees T_1, T_2 and T_3]
18 Identifiability problems: Different networks can display the same trees. Some networks display exactly the same trees. Because N_1 and N_2 display the same trees, they are equally good for any of the inference methods we saw, no matter the input data: UNIDENTIFIABILITY. [Figure: networks N_1 and N_2 on taxa a, b, c, d, both displaying the trees T_1, T_2 and T_3]
20 Identifiability problems: Indistinguishable networks. Branch lengths do not eliminate nonidentifiability. The same holds for inheritance probabilities. N_1 and N_2 display the same trees (including branch lengths) and are thus indistinguishable even to methods accounting for lengths. [Figure: networks N_1 and N_2 on taxa a, b, c, d, e with branch lengths and inheritance probabilities p and 1-p, and the trees they display]
24 Canonical networks: Unzipping a network. Key observation: we can move reticulations up or down (until they hit a speciation node) and the trees displayed by a network remain the same. By always moving reticulations down, N_1 and N_2 both end up becoming the same network N*. True in general: indistinguishable networks always transform into the same network, the canonical form of N_1 and N_2. [Figure: networks N_1 and N_2 on taxa a, b, c, d, e and their common canonical form N*]
25 Canonical networks: What it means for the evolutionary biologist. If N is reconstructed by a "classic" inference method, then even assuming perfect and unlimited data, the best you can hope for is that the true phylogenetic network is just one of the many that are indistinguishable from N. The canonical form of N is a unique representative of the networks indistinguishable from N that excludes their unrecoverable aspects. [Figure: a network N on taxa a, b, c, d, e and its canonical form N*]
26 Canonical networks: Take-home message for the computational biologist. If N is reconstructed by a "classic" inference method, then even assuming perfect and unlimited data, the best you can hope for is that the true phylogenetic network is just one of the many that are indistinguishable from N. "Classic" network inference methods should only attempt to reconstruct canonical networks. [Figure: network space partitioned into classes of indistinguishable networks, each containing one canonical network]
27 Dealing with deep coalescence: Are gene trees always displayed by the network? The approaches above assume that deep coalescence cannot occur, so gene trees are necessarily displayed trees, but this is not always the case. Currently accepted introgression scenario among modern humans, Neanderthals and Denisovans, based on sequenced (ancient) nuclear genomes. Phylogenetic tree for the mitochondrial genome (Krause et al., Nature 2010). [Figure: mitochondrial tree with tips grouped as modern humans, Neanderthals and Denisova; one node is dated 1.04 Myr (779 kyr to 1.3 Myr)]
28 ILS: An example of incomplete lineage sorting (ILS). [Figure: (a) population view; (b) reconciliation representation]
29 ILS: ILS and the probability of gene trees. Rannala & Yang 2003 (genealogies); Degnan & Salter 2005 (topologies). [Figure: species tree on A, B, C with the three possible gene tree topologies on a, b, c]
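For three species the gene tree topology probabilities have a well-known closed form under the multispecies coalescent: with an internal branch of length t coalescent units, the gene tree matching the species tree has probability 1 - (2/3)e^{-t} and each of the two discordant topologies has probability (1/3)e^{-t}. A small sketch, assuming one sampled allele per species (function name chosen for illustration):

```python
import math

def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A, B), C).

    t: length of the internal branch, in coalescent units.
    Returns (probability of the concordant topology,
             probability of each of the two discordant topologies).
    """
    discordant = math.exp(-t) / 3.0
    concordant = 1.0 - 2.0 * discordant
    return concordant, discordant

for t in (0.0, 0.5, 1.0, 3.0):
    concordant, discordant = three_taxon_gene_tree_probs(t)
    print(f"t = {t:.1f}: concordant = {concordant:.3f}, each discordant = {discordant:.3f}")
```

As t shrinks to zero the three topologies become equally likely, which is why short internal branches make discordance due to ILS common.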
31 ILS: ILS and the probability of gene trees. [Figure: the 15 rooted gene tree topologies on four taxa a, b, c, d]
32 ILS: Estimating the species tree under the coalescent model. Since standard phylogenetic methods can be inconsistent under ILS (Degnan & Rosenberg 2006), new methods have been developed to cope with this bias, e.g. STEM (Kubatko, Carstens & Knowles 2009), STAR (Liu, Yu, Pearl & Edwards 2009) and MP-EST (Liu, Yu & Edwards 2010). They estimate a phylogenetic species tree given a sample of gene trees (with or without branch lengths) under the coalescent model.
33 ILS: Estimating the species tree under the coalescent model. Since standard phylogenetic methods can be inconsistent under ILS (Degnan & Rosenberg 2006), new methods have been developed to cope with this bias, e.g. STEM (Kubatko, Carstens & Knowles 2009), STAR (Liu, Yu, Pearl & Edwards 2009) and MP-EST (Liu, Yu & Edwards 2010). The same can be done with networks! Yu et al. PNAS 2014. [Figure: network on A, B, C]
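35 Phylogenetic networks + ILS: Phylogenetic network inference under ILS. Without ILS: $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} \Pr(A_i \mid N) = \prod_{i=1}^{m} \sum_{T \in \mathcal{T}(N)} \Pr(A_i \mid T) \Pr(T \mid N)$. With ILS, integrating over gene trees: $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} \int_{G} \Pr(A_i \mid g)\, p(g \mid N)\, dg$. With one gene tree $g_i$ per locus: $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} p(g_i \mid N)$. [Figure: network N on taxa a, b, c, d, e, f]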
36 Phylogenetic networks + ILS: Phylogenetic network inference under ILS. Y. Yu & L. Nakhleh's group: Yu et al. BMC Bioinformatics 2013 (without branch lengths); Yu et al. PNAS 2014 (with branch lengths); Wen et al. PLOS Genetics 2016 (Bayesian method). PhyloNet: http://bioinfo.cs.rice.edu/phylonet. Other groups have also contributed significantly, e.g. L. Kubatko's group and C. Ané's group. $\Pr(A_1, A_2, \dots, A_m \mid N) = \prod_{i=1}^{m} p(g_i \mid N)$.
40 Dealing with deep coalescence: Deep coalescence (ILS) can solve indistinguishability. The following two scenarios would be indistinguishable on the basis of the trees they display. Imagine sampling two alleles from the hybrid species *. In the absence of ILS, the only observable gene trees are those displayed by the networks. [Figure: two network scenarios on A, B, C with hybrid species *, and the gene trees observable without ILS]
42 Dealing with deep coalescence: Deep coalescence (ILS) can solve indistinguishability. The following two scenarios would be indistinguishable on the basis of the trees they display. Imagine sampling two alleles from the hybrid species *. If the two alleles from * do not coalesce, three other gene trees can be observed. [Figure: two network scenarios on A, B, C with hybrid species *; each is annotated with the gene trees that would mostly be observed if it were the true network]
44 Dealing with deep coalescence: Network Multi-Species Coalescent (NMSC). Given a network with branch lengths (in coalescent units) and inheritance probabilities, we can calculate the probability of every possible gene tree. [Figure: network on A, *, B, C with inheritance probabilities ½ and a branch of length 0; gene tree probabilities shown are ¼, ⅛, ⅛, ¼, ¼ and 0]
45 Dealing with deep coalescence: Network Multi-Species Coalescent (NMSC). The probability of a gene tree T is a weighted average of the probabilities of T under a number of "parental trees" (14 here).
46 Dealing with deep coalescence: Network Multi-Species Coalescent (NMSC). This may solve the identifiability issues in several practical cases. Note that gene trees can also fail to be displayed by the network because of allopolyploidy.
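The "weighted average over parental trees" idea can be sketched on a deliberately simpler case than the one in the slides. The assumptions, all made for this example only, are: three species A, B, C with one sampled allele each; B is a hybrid that follows the parental tree ((A,B),C) with inheritance probability gamma and (A,(B,C)) otherwise; each parental tree has its own internal branch length in coalescent units. The log-likelihood of a sample of gene tree topologies is then the sum of log p(g_i | N).

```python
import math

TOPOLOGIES = ("ab|c", "ac|b", "bc|a")

def parental_tree_probs(sisters, t):
    """Three-taxon coalescent probabilities under one parental tree.

    sisters: the topology matching the parental tree; t: internal branch length.
    """
    discordant = math.exp(-t) / 3.0
    return {g: (1.0 - 2.0 * discordant if g == sisters else discordant)
            for g in TOPOLOGIES}

def network_gene_tree_probs(parental_trees):
    """Weighted average over parental trees given as (weight, sisters, t)."""
    probs = dict.fromkeys(TOPOLOGIES, 0.0)
    for weight, sisters, t in parental_trees:
        for g, p in parental_tree_probs(sisters, t).items():
            probs[g] += weight * p
    return probs

def gene_tree_log_likelihood(counts, probs):
    """log prod_i p(g_i | N), with gene trees summarised as topology counts."""
    return sum(n * math.log(probs[g]) for g, n in counts.items())

gamma = 0.5  # hypothetical inheritance probability
example_probs = network_gene_tree_probs([(gamma, "ab|c", 1.0),
                                         (1.0 - gamma, "bc|a", 1.0)])
example_loglik = gene_tree_log_likelihood({"ab|c": 40, "bc|a": 45, "ac|b": 15},
                                          example_probs)
```

The topology ac|b is displayed by neither parental tree, yet it receives positive probability through deep coalescence, which is the kind of signal that can help separate networks that display the same trees.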
47 Problems: Some issues. Searching the space of phylogenetic networks: the space of networks with k reticulations is infinite. Controlling for model complexity: because any network with k reticulations provides a more complex model than any network with (k-1) reticulations, we must handle the model selection problem (AIC, BIC, K-fold cross-validation, ...). Identifiability issues. Not accounting for ILS and allopolyploidy.
48 http://phylnet.univ-mlv.fr/ http://phylonetworks.blogspot.fr THANKS FOR YOUR ATTENTION
LETTERS. ; The complete mitochondrial DNA genome of an unknown hominin from southern Siberia
doi:.38/nature8976 ; The complete mitochondrial DNA genome of an unknown hominin from southern Siberia Johannes Krause, Qiaomei Fu, Jeffrey M. Good 2, Bence Viola,3, Michael V. Shunkov 4, Anatoli P. Derevianko
More informationToday's project. Test input data Six alignments (from six independent markers) of Curcuma species
DNA sequences II Analyses of multiple sequence data datasets, incongruence tests, gene trees vs. species tree reconstruction, networks, detection of hybrid species DNA sequences II Test of congruence of
More informationPhylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki
Phylogene)cs IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, 2016 Joyce Nzioki Phylogenetics The study of evolutionary relatedness of organisms. Derived from two Greek words:» Phle/Phylon: Tribe/Race» Genetikos:
More informationMaximum Likelihood Inference of Reticulate Evolutionary Histories
Maximum Likelihood Inference of Reticulate Evolutionary Histories Luay Nakhleh Department of Computer Science Rice University The 2015 Phylogenomics Symposium and Software School The University of Michigan,
More informationMul$ple Sequence Alignment Methods. Tandy Warnow Departments of Bioengineering and Computer Science h?p://tandy.cs.illinois.edu
Mul$ple Sequence Alignment Methods Tandy Warnow Departments of Bioengineering and Computer Science h?p://tandy.cs.illinois.edu Species Tree Orangutan Gorilla Chimpanzee Human From the Tree of the Life
More informationAnatomy of a species tree
Anatomy of a species tree T 1 Size of current and ancestral Populations (N) N Confidence in branches of species tree t/2n = 1 coalescent unit T 2 Branch lengths and divergence times of species & populations
More informationWenEtAl-biorxiv 2017/12/21 10:55 page 2 #2
WenEtAl-biorxiv 0// 0: page # Inferring Phylogenetic Networks Using PhyloNet Dingqiao Wen, Yun Yu, Jiafan Zhu, Luay Nakhleh,, Computer Science, Rice University, Houston, TX, USA; BioSciences, Rice University,
More informationCoalescent Histories on Phylogenetic Networks and Detection of Hybridization Despite Incomplete Lineage Sorting
Syst. Biol. 60(2):138 149, 2011 c The Author(s) 2011. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com
More informationASTRAL: Fast coalescent-based computation of the species tree topology, branch lengths, and local branch support
ASTRAL: Fast coalescent-based computation of the species tree topology, branch lengths, and local branch support Siavash Mirarab University of California, San Diego Joint work with Tandy Warnow Erfan Sayyari
More informationPhyloNet. Yun Yu. Department of Computer Science Bioinformatics Group Rice University
PhyloNet Yun Yu Department of Computer Science Bioinformatics Group Rice University yy9@rice.edu Symposium And Software School 2016 The University Of Texas At Austin Installation System requirement: Java
More informationSTEM-hy: Species Tree Estimation using Maximum likelihood (with hybridization)
STEM-hy: Species Tree Estimation using Maximum likelihood (with hybridization) Laura Salter Kubatko Departments of Statistics and Evolution, Ecology, and Organismal Biology The Ohio State University kubatko.2@osu.edu
More informationMitochondrial DNA Clocks Imply Linear Speciation Rates Within Kinds
Answers Research Journal 8 (215):273 34. www.answersingenesis.org/arj/v8/mitochondrial_dna_clocks_linear_speciation_rates.pdf Mitochondrial DNA Clocks Imply Linear s Within Kinds Nathaniel T. Jeanson,
More informationNon-binary Tree Reconciliation. Louxin Zhang Department of Mathematics National University of Singapore
Non-binary Tree Reconciliation Louxin Zhang Department of Mathematics National University of Singapore matzlx@nus.edu.sg Introduction: Gene Duplication Inference Consider a duplication gene family G Species
More informationfirst (i.e., weaker) sense of the term, using a variety of algorithmic approaches. For example, some methods (e.g., *BEAST 20) co-estimate gene trees
Concatenation Analyses in the Presence of Incomplete Lineage Sorting May 22, 2015 Tree of Life Tandy Warnow Warnow T. Concatenation Analyses in the Presence of Incomplete Lineage Sorting.. 2015 May 22.
More informationQuartet Inference from SNP Data Under the Coalescent Model
Bioinformatics Advance Access published August 7, 2014 Quartet Inference from SNP Data Under the Coalescent Model Julia Chifman 1 and Laura Kubatko 2,3 1 Department of Cancer Biology, Wake Forest School
More informationNJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees
NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana
More informationEstimating Evolutionary Trees. Phylogenetic Methods
Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent
More informationFrom Gene Trees to Species Trees. Tandy Warnow The University of Texas at Aus<n
From Gene Trees to Species Trees Tandy Warnow The University of Texas at Aus
More informationTaming the Beast Workshop
Workshop and Chi Zhang June 28, 2016 1 / 19 Species tree Species tree the phylogeny representing the relationships among a group of species Figure adapted from [Rogers and Gibbs, 2014] Gene tree the phylogeny
More informationEvolu&on, Popula&on Gene&cs, and Natural Selec&on Computa.onal Genomics Seyoung Kim
Evolu&on, Popula&on Gene&cs, and Natural Selec&on 02-710 Computa.onal Genomics Seyoung Kim Phylogeny of Mammals Phylogene&cs vs. Popula&on Gene&cs Phylogene.cs Assumes a single correct species phylogeny
More informationUpcoming challenges in phylogenomics. Siavash Mirarab University of California, San Diego
Upcoming challenges in phylogenomics Siavash Mirarab University of California, San Diego Gene tree discordance The species tree gene1000 Causes of gene tree discordance include: Incomplete Lineage Sorting
More informationCSCI1950 Z Computa4onal Methods for Biology Lecture 5
CSCI1950 Z Computa4onal Methods for Biology Lecture 5 Ben Raphael February 6, 2009 hip://cs.brown.edu/courses/csci1950 z/ Alignment vs. Distance Matrix Mouse: ACAGTGACGCCACACACGT Gorilla: CCTGCGACGTAACAAACGC
More informationSelection against A 2 (upper row, s = 0.004) and for A 2 (lower row, s = ) N = 25 N = 250 N = 2500
Why mitochondrial DN is simple Mitochondrial DN and the History of Population Size lan R Rogers January 4, 8 inherited from mother only no recombination evolves fast mean pairwise difference: average number
More informationThe Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection
The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection Yun Yu 1, James H. Degnan 2,3, Luay Nakhleh 1 * 1 Department of Computer Science, Rice
More informationIn comparisons of genomic sequences from multiple species, Challenges in Species Tree Estimation Under the Multispecies Coalescent Model REVIEW
REVIEW Challenges in Species Tree Estimation Under the Multispecies Coalescent Model Bo Xu* and Ziheng Yang*,,1 *Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China and Department
More informationPOPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics
POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the
More informationCSCI1950 Z Computa3onal Methods for Biology* (*Working Title) Lecture 1. Ben Raphael January 21, Course Par3culars
CSCI1950 Z Computa3onal Methods for Biology* (*Working Title) Lecture 1 Ben Raphael January 21, 2009 Course Par3culars Three major topics 1. Phylogeny: ~50% lectures 2. Func3onal Genomics: ~25% lectures
More informationImproved maximum parsimony models for phylogenetic networks
Improved maximum parsimony models for phylogenetic networks Leo van Iersel Mark Jones Celine Scornavacca December 20, 207 Abstract Phylogenetic networks are well suited to represent evolutionary histories
More informationPoint of View. Why Concatenation Fails Near the Anomaly Zone
Point of View Syst. Biol. 67():58 69, 208 The Author(s) 207. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email:
More informationWorkshop III: Evolutionary Genomics
Identifying Species Trees from Gene Trees Elizabeth S. Allman University of Alaska IPAM Los Angeles, CA November 17, 2011 Workshop III: Evolutionary Genomics Collaborators The work in today s talk is joint
More informationFast coalescent-based branch support using local quartet frequencies
Fast coalescent-based branch support using local quartet frequencies Molecular Biology and Evolution (2016) 33 (7): 1654 68 Erfan Sayyari, Siavash Mirarab University of California, San Diego (ECE) anzee
More informationIncomplete Lineage Sorting: Consistent Phylogeny Estimation From Multiple Loci
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1-2010 Incomplete Lineage Sorting: Consistent Phylogeny Estimation From Multiple Loci Elchanan Mossel University of
More informationDenisova cave. h,ps://
Denisova cave h,ps://www.flickr.com/photos/hmnh/3033749380/ Gene tree- species tree conflict can result from introgression or incomplete lineage sorcng Sous & Hey 2013 DisCnguishing incomplete lineage sorcng
More informationProperties of Consensus Methods for Inferring Species Trees from Gene Trees
Syst. Biol. 58(1):35 54, 2009 Copyright c Society of Systematic Biologists DOI:10.1093/sysbio/syp008 Properties of Consensus Methods for Inferring Species Trees from Gene Trees JAMES H. DEGNAN 1,4,,MICHAEL
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More informationDr. Amira A. AL-Hosary
Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological
More informationIntegrative Biology 200A "PRINCIPLES OF PHYLOGENETICS" Spring 2012 University of California, Berkeley
Integrative Biology 200A "PRINCIPLES OF PHYLOGENETICS" Spring 2012 University of California, Berkeley B.D. Mishler April 12, 2012. Phylogenetic trees IX: Below the "species level;" phylogeography; dealing
More informationAn introduction to phylogenetic networks
An introduction to phylogenetic networks Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University Email: steven.kelk@maastrichtuniversity.nl Web: http://skelk.sdf-eu.org Genome sequence,
More informationLevel 3 Biology, 2014