Comparative Methods on Phylogenetic Networks Claudia Solís-Lemus Emory University Joint Statistical Meetings August 1, 2018
Outline. Part I: Phylogenetic networks (What? Why? How? When?). Part II: PCM (phylogenetic comparative methods)
What? Phylogenetic network Gene flow
What? Phylogenetic network 17% Hahn et al (2016) Explicit Implicit
Why? Phylogenetic network Main tree history Gene flow as noise
Why? Phylogenetic network. Coalescent tree methods not robust to gene flow. [Figure: mean bootstrap support against number of genes (10, 20, 50, 100, 200, 500, 1k) for Concatenation, ASTRAL and NJst, with γ = 0.1 and γ = 0.3; clades shown: 34|1256, 123|456 or 124|356, other; white: true tree] (Solís-Lemus, Yang, Ané, 2016, Syst Bio)
How? Phylogenetic network
How? Phylogenetic network Estimate gene trees MrBayes (Huelsenbeck, Ronquist, 2001) RAxML (Stamatakis, 2014)
How? Phylogenetic network Data
How? Phylogenetic network Maximum pseudolikelihood Data (Solís-Lemus, Ané, 2016, PLoS Genetics) (Solís-Lemus et al, 2017, MBE) www.github.com/crsl4/phylonetworks
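A minimal sketch of the pseudolikelihood idea, assuming the data are summarized as quartet concordance factors (CFs) as in the cited PLoS Genetics paper: each 4-taxon subset contributes a multinomial term, and the terms are multiplied as if the subsets were independent. Only the tree-like quartet formula is shown here; the expected CFs for quartets that cross a hybrid edge are not implemented. The function names and the toy counts are hypothetical and are not the PhyloNetworks API.

```python
import math

def tree_quartet_cfs(t):
    """Expected CFs (major, minor, minor) for a tree-like quartet whose
    internal branch has length t in coalescent units."""
    minor = math.exp(-t) / 3.0
    return (1.0 - 2.0 * minor, minor, minor)

def log_pseudolikelihood(quartets):
    """Sum of multinomial log-likelihood kernels over 4-taxon subsets.

    quartets: list of (counts, cfs) pairs, where counts are the numbers of
    gene trees showing each of the three quartet resolutions and cfs are
    the expected CFs in the same order.
    """
    total = 0.0
    for counts, cfs in quartets:
        for n, cf in zip(counts, cfs):
            if n > 0:  # a zero count contributes nothing
                total += n * math.log(cf)
    return total

# Hypothetical toy data: two 4-taxon subsets, 100 gene trees each.
toy = [
    ((70, 16, 14), tree_quartet_cfs(1.0)),
    ((40, 31, 29), tree_quartet_cfs(0.1)),
]
toy_logpl = log_pseudolikelihood(toy)
```

In an actual search, the expected CFs would be recomputed from each candidate network and the network maximizing this quantity would be kept.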
When? Phylogenetic network Goodness-of-fit test TICR Data? https://github.com/nstenz/ticr (Stenz et al, 2015, Syst Bio)
Xiphophorus fish data (Cui et al., 2013): 1183 genes, 24 swordtails and platyfish. [Figure: estimated network with two hybrid edges (γ = 0.17 and γ = 0.19), clade labels SS, SP, NP, NS, and tips X. hellerii, X. alvarezi, X. mayae, X. signum, X. clemenciae, X. monticolus, X. maculatus, X. andersi, X. milleri, X. evelynae, X. variatus, X. couchianus, X. gordoni, X. meyeri, X. xiphidium, X. continens, X. pygmaeus, X. nigrensis, X. multilineatus, X. nezahuacoyotl, X. montezumae, X. birchmanni, X. malinche, X. cortezi] (Solís-Lemus, Ané, 2016, PLoS Genetics)
Xiphophorus fish data (Cui et al., 2013). Traits: sword index and female preference. [Figure: the same 24-taxon Xiphophorus network, with hybrid edges γ = 0.17 and γ = 0.19 and clade labels SS, SP, NP, NS] (Solís-Lemus, Ané, 2016, PLoS Genetics)
Trait models of evolution in networks. Brownian Motion + weighted average in hybrid: $X_h = \gamma_1 X_{p_1} + \gamma_2 X_{p_2}$ (Bastide et al, 2018, Syst Bio). Extends pedigree model for polygenic traits (Thomson, 2000). Similar model for allele frequencies (Pickrell & Pritchard, 2012). [Figure: example network with tree branches (γ = 1) and hybrid branches (γ_6a = 0.3, γ_6b = 0.7), next to a simulated phenotype trajectory over time for nodes X_1 to X_13]
Trait models of evolution in networks. Brownian Motion: $X \sim \mathcal{N}(X_{\text{root}}, \sigma^2 V)$, + weighted average in hybrid: $X_h = \gamma_1 X_{p_1} + \gamma_2 X_{p_2}$ (Bastide et al, 2018, Syst Bio). Applications: phylogenetic signal, ancestral reconstruction, phylogenetic regression, phylogenetic ANOVA. [Figure: example network with tree branches (γ = 1) and hybrid branches (γ_6a = 0.3, γ_6b = 0.7), next to a simulated phenotype trajectory over time for nodes X_1 to X_13]
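The model on this slide can be sketched as a short simulation: the trait follows Brownian motion along every edge, and a hybrid node takes the γ-weighted average of the values arriving from its two parents. This is an illustration under assumptions, not the PhyloNetworks implementation; the node encoding, the function name `simulate_bm` and the 3-tip toy network are hypothetical.

```python
import random

# Hypothetical toy network, listed root first (parents before children).
# Each node: (name, [(parent, gamma, edge_length), ...]).
NETWORK = [
    ("root", []),
    ("a", [("root", 1.0, 1.0)]),
    ("b", [("root", 1.0, 1.0)]),
    ("h", [("a", 0.3, 0.0), ("b", 0.7, 0.0)]),  # hybrid node
    ("A", [("a", 1.0, 1.0)]),
    ("B", [("b", 1.0, 1.0)]),
    ("H", [("h", 1.0, 1.0)]),
]

def simulate_bm(nodes, root_value=0.0, sigma2=1.0, rng=None):
    """Simulate one trait value per node under BM with hybrid averaging."""
    rng = rng or random.Random()
    values = {}
    for name, parents in nodes:
        if not parents:  # the root
            values[name] = root_value
            continue
        total = 0.0
        for parent, gamma, length in parents:
            # Value at the end of this edge: parent value plus a BM increment.
            end = values[parent] + rng.gauss(0.0, (sigma2 * length) ** 0.5)
            # Tree edges have gamma = 1; hybrid edges have gammas summing to 1.
            total += gamma * end
        values[name] = total
    return values

sample = simulate_bm(NETWORK, root_value=2.0, sigma2=0.5, rng=random.Random(1))
```

Because the two hybrid edges in the toy network have length zero, the value at `h` is exactly 0.3 times the value at `a` plus 0.7 times the value at `b`.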
Trait models of evolution in networks. $X \sim \mathcal{N}(X_{\text{root}}, \sigma^2 V)$, with $V_{ij} = t_{ij}$ if tree, and $V_{ij} = \sum_{p_i \text{ path to } i} \sum_{p_j \text{ path to } j} \gamma_{p_i} \gamma_{p_j} \sum_{e \in p_i \cap p_j} t_e$ if network, where $\gamma_p$ is the product of the inheritance probabilities of the edges on path $p$ (Bastide et al, 2018, Syst Bio). [Figure: example network with tree branches (γ = 1) and hybrid branches (γ_6a = 0.3, γ_6b = 0.7), next to a simulated phenotype trajectory over time for nodes X_1 to X_13]
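The network covariance V can also be filled in one node at a time instead of enumerating paths, which may be easier to follow. The sketch below assumes the BM model with weighted averaging at hybrids: a tree node inherits its parent's covariances and adds its edge length to the variance, and a hybrid node takes the γ-weighted combination of its parents' rows. The node encoding, the function name and the toy network are hypothetical.

```python
# Hypothetical toy network, listed root first (parents before children).
# Each node: (name, [(parent, gamma, edge_length), ...]).
NETWORK = [
    ("root", []),
    ("a", [("root", 1.0, 1.0)]),
    ("b", [("root", 1.0, 1.0)]),
    ("h", [("a", 0.3, 0.0), ("b", 0.7, 0.0)]),  # hybrid node
    ("A", [("a", 1.0, 1.0)]),
    ("B", [("b", 1.0, 1.0)]),
    ("H", [("h", 1.0, 1.0)]),
]

def network_covariance(nodes):
    """Return (V, index): V is the covariance among all nodes for
    sigma^2 = 1, index maps node names to rows of V."""
    n = len(nodes)
    V = [[0.0] * n for _ in range(n)]
    index = {}
    for k, (name, parents) in enumerate(nodes):
        index[name] = k
        # Covariance with every earlier node: gamma-weighted parent rows.
        for j in range(k):
            cov = sum(g * V[index[p]][j] for p, g, _ in parents)
            V[k][j] = V[j][k] = cov
        # Variance: each parent's variance plus the edge length, weighted
        # by gamma^2, plus cross terms between the two hybrid parents.
        var = sum(g * g * (V[index[p]][index[p]] + t) for p, g, t in parents)
        for x in range(len(parents)):
            for y in range(x + 1, len(parents)):
                px, gx, _ = parents[x]
                py, gy, _ = parents[y]
                var += 2.0 * gx * gy * V[index[px]][index[py]]
        V[k][k] = var
    return V, index

V, idx = network_covariance(NETWORK)
tips = ["A", "B", "H"]
V_tips = [[V[idx[r]][idx[c]] for c in tips] for r in tips]
```

For the toy network, the path formula gives the same entries by hand: for example, Cov(A, H) = 0.3 × 1, because only the path to H through `a` shares an edge (of length 1) with the path to A.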
Sword index and female preference. Ancestral reconstruction: common ancestor likely had sword. Phylogenetic regression: positive association between sword index and female preference, but not significant (p = 0.106). Model: $Y_i = \beta X_i + \epsilon_i$ where $\epsilon \sim \mathcal{N}(0, \sigma^2 V)$. [Figure: ancestral state reconstruction of sword index and female preference on the Xiphophorus network, with estimated values at internal nodes and tips]
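A regression with errors distributed as N(0, σ²V) is a generalized least squares problem, and a rough sketch of the point estimate fits in a few lines. The code below assumes one predictor plus an intercept (the intercept is an assumption of this sketch) and solves the normal equations (X'V⁻¹X)β = X'V⁻¹y with a small Gaussian elimination routine. The data are hypothetical, not the Xiphophorus measurements, and the function names are illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def gls_fit(x, y, V):
    """Return [intercept, slope] minimizing (y - Xb)' V^{-1} (y - Xb)."""
    ones = [1.0] * len(y)
    vi_ones = solve(V, ones)  # V^{-1} 1
    vi_x = solve(V, x)        # V^{-1} x
    vi_y = solve(V, y)        # V^{-1} y
    lhs = [[dot(ones, vi_ones), dot(ones, vi_x)],
           [dot(x, vi_ones), dot(x, vi_x)]]
    rhs = [dot(ones, vi_y), dot(x, vi_y)]
    return solve(lhs, rhs)

# Hypothetical 3-tip example: covariance of a small network and toy traits.
V_toy = [[2.0, 0.0, 0.3], [0.0, 2.0, 0.7], [0.3, 0.7, 1.58]]
x_toy = [0.0, 1.0, 2.0]
y_toy = [1.1, 2.9, 5.2]
intercept, slope = gls_fit(x_toy, y_toy, V_toy)
```

Standard errors and the p-value reported on the slide would additionally need an estimate of σ² and the t distribution, which this sketch leaves out.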
Test for transgressive evolution: $X_h = \gamma_1 X_{p_1} + \gamma_2 X_{p_2} + \delta_h$. Hybrid value: shift from parents range. Nested models compared with F tests: $\delta_h = 0$ (no transgressive evolution), $\delta_h = \delta$ (single-effect transgressive evolution), $\delta_h$ free for each hybrid (multi-effect transgressive evolution). [Figure: example network with hybrid branches γ_6a = 0.3 and γ_6b = 0.7, next to a simulated phenotype trajectory over time]
Test for transgressive evolution: $X_h = \gamma_1 X_{p_1} + \gamma_2 X_{p_2} + \delta_h$. Hybrid value: shift from parents range. Results. Sword index: no evidence of transgressive evolution. Female preference: evidence for heterogeneous transgressive evolution (p = 0.009). [Figure: example network with hybrid branches γ_6a = 0.3 and γ_6b = 0.7, next to a simulated phenotype trajectory over time]
Step-by-step tutorial Online documentation Google user group www.github.com/crsl4/phylonetworks http://crsl4.github.io/ (Solís-Lemus et al, 2017, MBE)
Acknowledgements Cécile Ané Bret Larget Douglas Bates Paul Bastide Ricardo Kriebel Will Sparks David Baum Mengyao Yang John Malloy John Spaw Noah Stenz Nan Ji Jordan Vonderwell Josh McGrath http://crsl4.github.io/ csolisl@emory.edu
Anomalous unrooted gene trees with gene flow
Frequency among gene trees, with t_1 = t_2 = 0.01, t_3 = t_4 = t_5 = 1:
Quartet   γ = 0.0   γ = 0.1   γ = 0.3
AB|CD     0.347     0.298     0.260
CA|BD     0.327     0.351     0.370
CB|AD     0.327     0.351     0.370
ILS: no AUGT on 4 taxa (Degnan, 2013). ILS+HGT: AUGT on 4 taxa (Solís-Lemus, Yang, Ané, 2016, Syst Bio)
Anomaly zone with gene flow. [Figure: curves labeled "match species tree" and "conflict species tree" plotted against γ from 0 to 1, with γ = 0.212 and γ = 0.876 marked; vertical axis from 0.25 to 0.45] (Solís-Lemus, Yang, Ané, 2016, Syst Bio)
SNaQ performance. [Figure: two example networks with branch lengths, labeled "Good diamond" and "Bad diamond"] (Solís-Lemus, Ané, 2016, PLoS Genetics)
Idea of proof of identifiability: hybridization System of equations {CFnetwork} System of equations {CFtree}
Idea of proof of identifiability: hybridization Solution to CFnetwork = CFtree if