STEM-hy: Species Tree Estimation using Maximum likelihood (with hybridization)
Laura Salter Kubatko
Departments of Statistics and Evolution, Ecology, and Organismal Biology
The Ohio State University
June 7, 2013

Outline
- What is STEM-hy?
- Assumptions and Methods
- Data Preparation
- Example 2: Small Example with Missing Data
- Example 3: Bootstrap Analysis
- Background: STEM's Hybrid Species Models

What is STEM-hy?
STEM-hy is a program that performs maximum likelihood estimation of the species tree from multilocus data under the coalescent process. It can also evaluate hybrid taxa.
Basic functions:
- Return the ML species tree.
- Search the space of all species trees and return the k trees with the highest likelihoods found.
- Compute the likelihood of a user-specified tree with branch lengths.
- Find optimal branch lengths on a user-specified tree.
- Carry out a bootstrap analysis to obtain bootstrap support values for nodes in the species tree.
- Evaluate hypotheses of hybridization in a model selection framework.

Assumptions
- No recombination within loci.
- Free recombination between loci.
- No gene flow following speciation.
- The only source of variability in single-gene histories is the coalescent process.
- There is a single θ for the entire tree, for each locus.
- Evolutionary rates may vary across loci.

Methods: ML Estimate of the Species Tree
- Liu et al. (2009) showed that the ML estimate of the species tree can be computed by sequentially clustering the minimum observed divergence times between pairs of species across genes.
- They also showed that, when gene trees are known without error, the ML species tree is a consistent estimator.
- Mossel & Roch (2010) obtained a similar result; they call their estimator the GLASS tree (an acronym for Global LAteSt Split, based on the algorithm they developed to compute it).
- STEM computes the ML estimate of the species tree this way.
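
The sequential clustering described above can be sketched as single-linkage clustering on a matrix of minimum divergence times. This is a minimal illustration, not STEM's actual code: the function name `glass_tree`, the input layout (a dictionary keyed by species pairs), and the four-decimal Newick formatting are all assumptions made for the example.

```python
from itertools import combinations


def glass_tree(species, min_time):
    """Cluster species by minimum divergence time (single linkage).

    min_time maps frozenset({a, b}) to the smallest divergence time observed
    between any lineage of species a and any lineage of species b over all loci.
    Returns a Newick string whose node heights are those minimum times.
    """
    # Each cluster maps its member set to (newick_subtree, node_height).
    clusters = {frozenset([s]): (s, 0.0) for s in species}

    def dist(c1, c2):
        # Single linkage: the closest pair of species decides the distance.
        return min(min_time[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        height = dist(c1, c2)
        (sub1, h1), (sub2, h2) = clusters.pop(c1), clusters.pop(c2)
        merged = f"({sub1}:{height - h1:.4f},{sub2}:{height - h2:.4f})"
        clusters[c1 | c2] = (merged, height)

    newick, _ = next(iter(clusters.values()))
    return newick + ";"
```

Because single-linkage merge heights never decrease, every branch length in the returned tree is non-negative.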

Methods: Estimation of ML Times for an Arbitrary Species Tree
- The results of Liu et al. (2009) can be extended to derive the ML estimates of the speciation times for an arbitrary species tree.
- Thus, the likelihood of any species tree can be readily computed by using this result to obtain ML branch lengths.
- This is important because it allows us to compare alternative phylogenetic hypotheses.

Methods: Searching Species Tree Space for Trees of High Likelihood
- A simulated annealing algorithm is used to search the space of all species trees for trees that have high likelihoods.
- The k best trees found during the search are saved and printed to a file (k is set by the user).
- Exploration of the likelihood surface is particularly important for many of these problems.
- The details of the simulated annealing algorithm are similar to those given in Salter & Pearl (2001).

Features of STEM-hy
- No limits (that I know of) on the number of taxa or the number of loci.
- Can handle intraspecific sampling.
- Allows information concerning the mutation rate for each locus to be used in the analysis.
- Can handle different taxon samples across genes.
- Version 1.1 is written in Java (using Clojure).

Data Preparation - Gene Trees
STEM-hy takes as its input one gene tree for each locus. Thus, a first step in an analysis using STEM-hy is to estimate gene trees with branch lengths for each locus. Any method can be used to do this, but note a couple of requirements:
- Branch lengths are assumed to be in units of expected number of substitutions per site per unit time.
- Branch lengths must be estimated subject to a molecular clock. This is not checked by the program.
- Gene trees must be fully resolved; however, polytomies can be included by setting branch lengths to 0 for an arbitrary resolution of the polytomy.
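
Since the program does not check the molecular clock requirement, a quick pre-flight check may be useful. The sketch below is not part of STEM-hy; it assumes a Newick string with a branch length on every edge and no internal node labels, and the names `tip_depths` and `is_clocklike` are hypothetical.

```python
import re


def tip_depths(newick):
    """Return {leaf name: root-to-tip distance} for a Newick tree with branch lengths."""
    depths = {}
    stack = []  # one list of leaf names per currently open clade
    last = []   # leaves below the node whose branch length comes next
    for tok in re.findall(r"[(),;]|:[^,();]+|[^,():;]+", newick):
        tok = tok.strip()
        if tok == "(":
            stack.append([])
        elif tok == ")":
            last = stack.pop()
            if stack:
                stack[-1].extend(last)
        elif tok.startswith(":"):
            # A branch length lengthens the path to every leaf below it.
            for leaf in last:
                depths[leaf] += float(tok[1:])
        elif tok and tok not in ",;":
            depths[tok] = 0.0
            last = [tok]
            stack[-1].append(tok)
    return depths


def is_clocklike(newick, tol=1e-9):
    """True when all root-to-tip distances agree to within tol."""
    depths = tip_depths(newick).values()
    return max(depths) - min(depths) <= tol
```

The tolerance is a judgment call; gene trees written with few decimal places may need a looser value than the default used here.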

Data Preparation - Population Genetics Parameters
- A value of the parameter θ = 4Nµ must be provided.
- Note that this is the per-site θ, not a per-locus value as used by other population genetics programs.
- This will be used to convert gene tree branch lengths to coalescent units (number of 2N generations) by dividing all gene tree branch lengths by θ.
- Estimates of θ could be obtained by standard methods.
- Typical values of θ will be between and 0.1.
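
As a small worked illustration of the two points above (per-site versus per-locus θ, and division of branch lengths by θ), the following sketch uses hypothetical helper names and is not STEM-hy's own code.

```python
def per_site_theta(per_locus_theta, n_sites):
    """Convert a per-locus theta to the per-site theta the slides ask for."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return per_locus_theta / n_sites


def to_coalescent_units(branch_lengths, theta):
    """Divide branch lengths (substitutions per site) by the per-site theta."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return [length / theta for length in branch_lengths]
```

For example, a per-locus θ of 0.5 on a 500-site locus corresponds to a per-site θ of 0.001, and a branch of length 0.002 then spans 2 coalescent units.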

Data Preparation - Population Genetics and Mutation Parameters
Each locus can also be given a rate multiplier. These can adjust for:
- Variation in mutation rate across loci.
- Ploidy (e.g., haploid loci such as mtDNA should be given a rate of 0.5).
At the least, one should estimate rate variation from the data by something like the following:
- Compute the average pairwise sequence divergence of each sequence to the outgroup, for each gene.
- Divide all of these values by their overall mean, and assign that number as the rate multiplier for each gene.
- Adjust specific genes for ploidy, if necessary.
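
The recipe above can be sketched as follows. The function name, the dictionary input, and the choice to apply the ploidy adjustment as a multiplicative factor on top of the normalized rate are assumptions made for illustration.

```python
def rate_multipliers(mean_divergence, ploidy_factor=None):
    """Compute per-locus rate multipliers.

    mean_divergence: {locus: average pairwise divergence to the outgroup}.
    ploidy_factor:   optional {locus: factor}, e.g. 0.5 for a haploid locus
                     (applying it multiplicatively is an assumption).
    """
    overall = sum(mean_divergence.values()) / len(mean_divergence)
    rates = {locus: d / overall for locus, d in mean_divergence.items()}
    for locus, factor in (ploidy_factor or {}).items():
        rates[locus] *= factor
    return rates
```

Before any ploidy adjustment the multipliers average to 1 by construction, so a locus with a multiplier above 1 is evolving faster than the average locus in the data set.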

- Start with a small example where we can work things out by hand.
- Four species, eight lineages, and two loci (N = 2).
- Suppose that the gene trees for the two loci are:
[Figure: gene trees for the two loci.]

STEM
Now we can run STEM and look at the output. First, let's compute the relevant distances by hand:
[Tables: upper-triangular distance matrices over species S1, S2, S3, S4 for locus 1, {D_ab^1}, and locus 2, {D_ab^2}, followed by the matrix of their entry-wise minima. Only fragments of the entries survive: 1.2 in the S3 row for locus 1 and 1.1 in the S3 row for locus 2.]
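
The by-hand step (take, for each pair of species, the minimum divergence over all lineage pairs and all loci) can be sketched in Python. This is an illustration under stated assumptions rather than STEM's implementation: gene trees are clock-like Newick strings with a length on every branch, divergence is measured as the height of the node where two lineages coalesce, and the function names are hypothetical. Any scaling by θ or by rate multipliers is left out.

```python
import re
from itertools import combinations
from math import inf


def coalescence_times(newick):
    """Map each pair of leaves to the height of their most recent common ancestor."""
    times = {}
    stack = []  # one list of [leaves, height at top of stem] per open clade
    for tok in re.findall(r"[(),;]|:[^,();]+|[^,():;]+", newick):
        tok = tok.strip()
        if tok == "(":
            stack.append([])
        elif tok == ")":
            children = stack.pop()
            height = children[0][1]  # clock assumed: all children agree
            for (left, _), (right, _) in combinations(children, 2):
                for a in left:
                    for b in right:
                        times[frozenset((a, b))] = height
            node = [[x for leaves, _ in children for x in leaves], height]
            if stack:
                stack[-1].append(node)
        elif tok.startswith(":"):
            if stack:  # a length on the root itself is ignored
                stack[-1][-1][1] += float(tok[1:])
        elif tok and tok not in ",;":
            stack[-1].append([[tok], 0.0])
    return times


def min_species_times(gene_trees, species_of):
    """Minimum coalescence time per species pair over all loci."""
    result = {}
    for newick in gene_trees:
        for pair, t in coalescence_times(newick).items():
            a, b = pair
            sa, sb = species_of[a], species_of[b]
            if sa != sb:
                key = frozenset((sa, sb))
                result[key] = min(result.get(key, inf), t)
    return result
```

Lineages missing from a locus are simply absent from that locus's pairs, which is one natural way to accommodate different taxon samples across genes.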

Step 1: Prepare the gene trees
Option 1: Place all gene trees in a single file called genetrees.tre:
- Newick format required.
- One gene tree per line.
- Rate multipliers must be given in brackets in front of each gene tree.

    [1.0](((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
    [1.0]((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ):0.0003,(Name5:0.0010,Name6:0.0010): ): ,(MyName7:0.0011,Name8:0.0011):0.0024);

Option 2: Place sets of gene trees in separate files:
- File names will be supplied to STEM-hy in the settings file.
- Rate multipliers will also be supplied in the settings file.
- All genes in a single file are assumed to have the same rate.

genetrees1.tre:
    (((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
genetrees2.tre:
    ((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ):0.0003,(Name5:0.0010,Name6:0.0010): ): ,(MyName7:0.0011,Name8:0.0011):0.0024);

Step 2: Prepare the settings file - input option 1
yaml format: headings with indented parameters defined below

    properties:
      run: 1    #0=user-tree, 1=MLE, 2=search, 3=hybridization, 4=bootstrap
      theta:
      num saved trees: 15
      beta:
      seed:
    species:
      Species1: Name1, Name2, Name3
      Species2: Name4, Name5
      Species3: Name6, MyName7
      Species4: Name8

Step 2: Prepare the settings file - input option 2
yaml format: headings with indented parameters defined below

    properties:
      run: 1    #0=user-tree, 1=MLE, 2=search, 3=hybridization, 4=bootstrap
      theta:
      num saved trees: 15
      beta:
      seed:
    species:
      Species1: Name1, Name2, Name3
      Species2: Name4, Name5
      Species3: Name6, MyName7
      Species4: Name8
    files:
      genetrees1.tre: 1.0    # notice the space after each :
      genetrees2.tre: 1.0

Some parameters are only used for certain run settings. They are ignored otherwise, and can be omitted from the settings file.

Results
Analysis 1: Find the ML species tree (run with run: 1)
Run at the command line with:
    java -jar stem-hy.jar

    ***************************************
    **        Welcome to STEM 2.0        **
    ***************************************
    The settings file was successfully parsed...
    Using theta =
    The settings file contained 4 species and 8 lineages.
    The species-to-lineage mappings are:
    Species4: Name8
    Species3: MyName7, Name6
    Species2: Name4, Name5
    Species1: Name1, Name2, Name3

Results are written to the file mle.tre:

    ****************Results*****************
    D AB Matrix:
    [ ]
    [ ]
    [ ]
    [ ]
    Maximum Likelihood Species Tree (Newick format):
    (Species1: ,(Species4: ,(Species2: ,Species3: ): ): );
    Log likelihood for tree:
    ****************** Done ****************

Analysis 2: Find likelihood of all 15 trees (run with run: 2)
Output files:

    ***************************************
    **        Welcome to STEM 2.0        **
    ***************************************
    The settings file was successfully parsed
    Beginning search now (this could take a while)...
    Search completed. Here are the results (also written to file search.tre):
    [ ] (Species1: ,(Species4: ,(Species2: ,Species3: ): ): );
    [ ] (Species1: ,(Species3: ,(Species2: ,Species4: ): ): );
    [ ] ((Species4: ,Species1: ): ,(Species2: ,Species3: ): );
    [ ] (Species4: ,(Species1: ,(Species2: ,Species3: ): ): );
    [ ] (Species4: ,(Species2: ,(Species1: ,Species3: ): ): );
    [ ] ((Species1: ,Species3: ): ,(Species2: ,Species4: ): );
    [ ] (Species3: ,(Species1: ,(Species2: ,Species4: ): ): );
    [ ] (Species2: ,((Species1: ,Species4: ): ,Species3: ): );
    [ ] (Species2: ,((Species1: ,Species3: ): ,Species4: ): );

Analysis 3: Find the likelihood of a particular species tree
- Place the tree(s) of interest in the file user.tre in the same directory as STEM-hy:
    ((Species1: ,Species3: ): ,(Species2: ,Species4: ): );
- Branch lengths must be included.
- STEM-hy gives the likelihood of the tree with the user-specified branch lengths, as well as the ML branch lengths along the user tree.

    ***************************************
    **        Welcome to STEM 2.0        **
    ***************************************
    The settings file was successfully parsed
    Read 1 species tree[s] from user.tre
    ****************Results*****************
    User tree:
    ((Species1: ,Species3: ): ,(Species2: ,Species4: ): )
    Log likelihood for tree:
    **************Optimized Trees************
    Optimized user tree:
    ((Species1: ,Species3: ): ,(Species2: ,Species4: ): );
    Log likelihood:
    ****************** Done ****************

Example 2: Missing Data Example
genetrees.tre:

    [1.0](((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
    [1.0]((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ): ,(Name5:0.0010,Name6:0.0010): );
    [1.0](((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
    [1.0]((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ):0.0003,(Name5:0.0010,Name6:0.0010): ): ,(MyName7:0.0011,Name8:0.0011):0.0024);
    [1.0](((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
    [1.0]((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ):0.0003,(Name5:0.0010,Name6:0.0010): ): ,(MyName7:0.0011,Name8:0.0011):0.0024);
    [1.0](((Name1: ,Name2: ): ,(Name3: ,Name4: ): ):0.0010,((Name5:0.0010,Name6:0.0010):0.0014,(MyName7:0.0012,Name8:0.0012):0.0012): );
    [1.0]((((Name1: ,Name2: ): ,(Name3:0.0012,Name4:0.0012): ):0.0003,(Name5:0.0010,Name6:0.0010): ): ,(MyName7:0.0011,Name8:0.0011):0.0024);

Look at the gene trees:
[Figure: three gene tree shapes. One, on all eight lineages Name1 to Name8, occurs at 4 loci; one, on Name1 to Name6 only (MyName7 and Name8 missing), occurs at 1 locus; one, on all eight lineages, occurs at 3 loci.]

Note: The settings file remains unchanged. Below is the output.

    ****************Results*****************
    D AB Matrix:
    [ ]
    [ ]
    [ ]
    [ ]
    Maximum Likelihood Species Tree (Newick format):
    (Species1: ,(Species4: ,(Species2: ,Species3: ): ): );
    log likelihood for tree:
    ****************** Done ****************

Example Data: Heliconius Butterflies
[Figure: species tree with tips H. hecale, H. melpomene, H. heurippa, H. cydno and node labels ABCD, BCD, BD, CD.]

Example 3: Bootstrap Analysis
- The current version of STEM-hy can be used to estimate bootstrap proportions on the ML tree, as well as to construct a bootstrap consensus tree.
- Sequence data must be provided in PHYLIP format (a separate file must be used for each gene).
- Each gene is bootstrapped a user-specified number of times, B, to produce B bootstrap samples (alignments) for each gene.
- Gene trees are estimated for each bootstrap sample using the program SSA. This program uses a simulated annealing method to estimate gene trees under the assumption of a molecular clock.
- B species trees are reconstructed using STEM-hy and printed both to the screen and to the file bootstrap.results.
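
The resampling step for one gene can be sketched as below. This shows the standard nonparametric bootstrap over alignment columns, which is assumed here to be what "bootstrapped" means on the slide; it is not STEM-hy's own code, the function names are hypothetical, and reading PHYLIP files and running SSA are left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: {sequence name: aligned sequence}, all of equal length.
    The same sampled columns are used for every sequence.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, b, seed=0):
    """Produce b bootstrap alignments for one gene."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(b)]
```

Each replicate has the same number of sites as the original alignment, and a fixed seed makes the set of replicates reproducible.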

Example 3: Bootstrap Analysis
For this example, we'll consider four taxa and six genes in Heliconius butterflies. The settings file is shown below (changes were highlighted in blue on the original slide):

    properties:
      run: 4    #0=user-tree, 1=MLE, 2=search, 3=hybridization, 4=bootstrap
      bootstrap samples: 100
      phylip files: co 4tax.phy,dll 4tax.phy,inv 4tax.phy,sd 4tax.phy,tpi 4tax.phy,white 4tax.phy
      theta: 0.01
      num saved trees: 15
      beta:
      seed:
    species:
      H. melpomene: M95
      H. hecale: Hh
      H. cordula: M187
      H. heurippa: Strib40

Below is the output. All bootstrap trees are written to a file called bootstrap.results and can be read into another program and summarized.

    ...
    The species-to-lineage mappings are:
    H. heurippa: Strib40
    H. cordula: M187
    H. hecale: Hh
    H. melpomene: M95
    Bootstrapping trees (this might take a while)...
    ****************Results*****************
    The maximum likelihood species tree estimate is:
    (H. hecale: ,(H. melpomene: ,(H. heurippa: ,H. cordula: ): ): );
    The 100 bootstrapped species trees:
    (H. heurippa: ,(H. hecale: ,(H. melpomene: ,H. cydno: ): ): );
    (H. hecale: ,(H. melpomene: ,(H. heurippa: ,H. cydno: ): ): );

Some Notes on Program Versions
There are some important differences between STEM v1.1a and STEM v2.0 / STEM-hy v1.0.
- Multifurcations are handled differently.
  - STEM v1.1a and lower: Zero-length branches are set to
  - STEM v2.0 / STEM-hy v1.0: Zero-length branches are treated as missing data.
- Other big differences are improvements to the input format and increased functionality in later versions.

Background: STEM's Hybrid Species Models

STEM's Hybrid Species Model
- The species tree is subject to hybridization.
- A hybridization parameter, $\gamma$, models the extent of the contribution from each parent.
- There are two possible parental species trees: ((A,B),C) with weight $\gamma$, and (A,(B,C)) with weight $1-\gamma$. In each, $\tau$ is the length of the internal branch.
- Under the coalescent model, each parental tree gives a probability to each gene tree topology:
  - Parental tree ((A,B),C), weight $\gamma$: $P(C(AB)) = 1 - \tfrac{2}{3}e^{-\tau}$, $P(A(BC)) = \tfrac{1}{3}e^{-\tau}$, $P(B(AC)) = \tfrac{1}{3}e^{-\tau}$
  - Parental tree (A,(B,C)), weight $1-\gamma$: $P(C(AB)) = \tfrac{1}{3}e^{-\tau}$, $P(A(BC)) = 1 - \tfrac{2}{3}e^{-\tau}$, $P(B(AC)) = \tfrac{1}{3}e^{-\tau}$
- Sequence evolution (the mutation process) proceeds along the gene trees.
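
The topology probabilities above, and their mixture under the hybridization parameter, can be computed directly. This is a small numerical sketch; the function names and the use of the sister pair ("AB", "BC", "AC") as a dictionary key for each topology are conventions invented for the example.

```python
import math


def topology_probs(tau, sister_pair):
    """Gene tree topology probabilities for three taxa under the coalescent.

    sister_pair names the sister taxa in the species tree ('AB', 'BC' or 'AC');
    tau is the internal branch length in coalescent units.
    Keys of the result name the sister pair in the gene tree, so 'AB' is C(AB).
    """
    discordant = math.exp(-tau) / 3.0
    probs = {"AB": discordant, "BC": discordant, "AC": discordant}
    probs[sister_pair] = 1.0 - 2.0 * discordant
    return probs


def hybrid_topology_probs(tau, gamma):
    """Mix the two parental trees with weights gamma and 1 - gamma."""
    p1 = topology_probs(tau, "AB")  # parental tree ((A,B),C)
    p2 = topology_probs(tau, "BC")  # parental tree (A,(B,C))
    return {k: gamma * p1[k] + (1.0 - gamma) * p2[k] for k in p1}
```

With `tau = 0` every topology has probability 1/3 regardless of `gamma`, which is the case in which gene tree topologies carry no information about hybridization.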

Inference of Trees Subject to Hybridization
Assumptions:
- Hybridization results in a mosaic genome, so that a sampled gene has a probability distribution over which of several parental species trees its history originated from.
- Genes in the sample are independent given the species tree.
- Hybridization events happen only between sister taxa.
- No factors other than coalescence and hybridization lead to incongruence between gene trees and the species tree.

Likelihood Calculation for the Three-taxon Case
- Let $f(g_i \mid S)$ be the probability density of gene tree $g_i$ given species tree $S$ under the coalescent model (Rannala and Yang, 2003).
- The likelihood function for the three-taxon case is
  $$\prod_{i=1}^{N} \left\{ \gamma f(g_i \mid S_1) + (1-\gamma) f(g_i \mid S_2) \right\}$$
  where $S_1$ and $S_2$ are the two possible parental species trees and $\gamma \in [0, 1]$.
[Figure: parental tree $S_1$ (weight $\gamma$) contributes $f(g \mid S_1)$ and parental tree $S_2$ (weight $1-\gamma$) contributes $f(g \mid S_2)$; the mutation process acts along each gene tree.]

Beyond Three Taxa...
- Propose a method which incorporates any number of hybridization events, provided they occur between sister taxa.
- Each putative hybridization event is assigned a parameter, $\gamma_1, \gamma_2, \ldots$
- The likelihood is computed by looking at all combinations of possible parental species trees, weighted appropriately by the $\gamma_j$ parameters.

A Bigger Example
Motivating example: consider the hybrid species tree
[Figure: hybrid species tree on taxa A, B, C, D, E, F, together with its parental trees.]

The Likelihood Function
[Figure: the four parental trees on taxa A to F, with weights $S_1$: $\gamma_1\gamma_2$, $S_2$: $\gamma_1(1-\gamma_2)$, $S_3$: $(1-\gamma_1)\gamma_2$, $S_4$: $(1-\gamma_1)(1-\gamma_2)$.]
$$\prod_{i=1}^{N} \left\{ \gamma_1\gamma_2 f(g_i \mid S_1) + \gamma_1(1-\gamma_2) f(g_i \mid S_2) + (1-\gamma_1)\gamma_2 f(g_i \mid S_3) + (1-\gamma_1)(1-\gamma_2) f(g_i \mid S_4) \right\}$$
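
The weighting over all combinations of parental trees generalizes to any number of hybridization events. The sketch below takes the per-gene densities $f(g_i \mid S)$ as given inputs, since computing them requires the coalescent density itself; the function names and the tuple-of-booleans indexing of parental trees are assumptions made for illustration.

```python
import math
from itertools import product


def parental_weights(gammas):
    """Yield (choices, weight) for every combination of parental trees.

    choices[j] is True when hybridization event j follows the parent that
    carries weight gamma_j, and False for the parent with weight 1 - gamma_j.
    """
    for choices in product([True, False], repeat=len(gammas)):
        weight = 1.0
        for gamma, chosen in zip(gammas, choices):
            weight *= gamma if chosen else 1.0 - gamma
        yield choices, weight


def hybrid_log_likelihood(gammas, densities):
    """Log of the product over genes of the weighted sum over parental trees.

    densities[i][choices] holds f(g_i | S) for the parental tree S indexed
    by choices.
    """
    total = 0.0
    for per_gene in densities:
        total += math.log(sum(w * per_gene[c] for c, w in parental_weights(gammas)))
    return total
```

For two events the combinations come out in the order (True, True), (True, False), (False, True), (False, False), matching $S_1$, $S_2$, $S_3$, $S_4$ in the likelihood above.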

Comments on Computation
- Parameters in the likelihood function: $\gamma_1$, $\gamma_2$, and the branch lengths.
- For a given hybrid species tree and sample of gene trees with divergence times, maximum likelihood branch lengths can be determined analytically.
- Fitting the likelihood model for a hypothesized hybrid species tree therefore only requires optimization of the $\gamma$ parameters.
- Implemented in a modified version of the program STEM, called STEM-hy.
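
For a single hybridization event, optimizing $\gamma$ is a one-dimensional problem, and the log-likelihood is concave in $\gamma$ because it is a sum of logarithms of functions linear in $\gamma$. A ternary search is therefore enough for a sketch. This is not the optimizer STEM-hy uses (the slides do not say which one it is); the densities are assumed to be strictly positive inputs, and the function names are hypothetical.

```python
import math


def log_lik(gamma, f1, f2):
    """Sum over genes of log(gamma * f(g_i | S1) + (1 - gamma) * f(g_i | S2))."""
    return sum(math.log(gamma * a + (1.0 - gamma) * b) for a, b in zip(f1, f2))


def fit_gamma(f1, f2, tol=1e-8):
    """Maximize the concave log-likelihood over gamma in [0, 1] by ternary search."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_lik(m1, f1, f2) < log_lik(m2, f1, f2):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2.0
```

When the estimate lands at 0 or 1, the fitted hybrid model collapses to one of the parental trees.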

Selecting the Best Hybrid Species Tree
For the example hybrid species tree, pick the best hybrid model from among the possible models using the AIC:

    Model    γ1       γ2       Number of Parameters
    1 to 4   0 or 1   0 or 1   5
    5        0        (0,1)    6
    6        1        (0,1)    6
    7        (0,1)    0        6
    8        (0,1)    1        6
    9        (0,1)    (0,1)    7

[Figure: the Tree column of the original table showed a tree on taxa A to F for each model.]

STEM-hy: Assumptions
- In practice, the $\gamma_i$ are not given (neither are the times of speciation or hybridization events). The algorithm finds MLEs for these parameters.
- STEM-hy inherits all of STEM's other assumptions (e.g., no gene flow after speciation if no hybridization, gene tree variability is not taken into consideration, etc.).
- One important point: STEM-hy looks for evidence of hybridization in the presence of incomplete lineage sorting. By using the model in STEM-hy to compute likelihoods, the coalescent process is incorporated.
- The AIC is used to compare models:
  $$\mathrm{AIC} = -2 \ln L(M \mid D) + 2k$$
  where $M$ is the model, $D$ is the data, and $k$ is the number of parameters. $\ln L(M \mid D)$ is the log-likelihood from STEM-hy for the hybridization model under consideration.
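
The model comparison step is simple enough to state in a few lines. The sketch below only restates the AIC formula; the function names are hypothetical.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def best_model(models):
    """Return the name of the model with the smallest AIC.

    models: {name: (log_likelihood, number_of_parameters)}.
    """
    return min(models, key=lambda name: aic(*models[name]))
```

A model with a free $\gamma$ pays a penalty of 2 for the extra parameter, so it is preferred over a parental tree only when it raises the log-likelihood by more than 1.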

Input data format is the same as for previous analyses:
- Gene trees are placed in the file called genetrees.tre (option 1), or the files containing the gene trees are listed in the settings file (option 2).
- The settings file (in yaml format) is used to give user settings (e.g., θ).
- The run option is set to 3.

The user must additionally provide information about hybridization:
- The only option at present is to use a user-specified tree; the present version of the program assumes that the overall species phylogeny is known.
- The user-specified tree is one of the possible parental trees; it doesn't matter which one.
- The putative hybrid species are identified in the settings.yaml file.

STEM-hy Example: Heliconius Butterflies
[Figure: species tree with tips H. hecale, H. melpomene, H. heurippa, H. cydno and node labels ABCD, BCD, BD, CD.]

Example genetrees.tre file:

    [ ]((Hheurippa: ,(Hcydno: ,Hmelpomene: ): ): ,Hhecale: );
    [ ]((Hmelpomene: ,(Hcydno: ,Hheurippa: ): ):0.001,Hhecale: );
    [ ](((Hcydno: ,Hheurippa: ): ,Hmelpomene: ): ,Hhecale: );
    [ ](((Hheurippa: ,Hcydno: ): ,Hmelpomene: ): ,Hhecale: );
    [ ](((Hheurippa: ,Hmelpomene: ): ,Hcydno: ): ,Hhecale: );
    [ ](((Hheurippa: ,Hcydno: ): ,Hmelpomene: ): ,Hhecale: );

Example settings file:

    properties:
      run: 3
      theta:
      beta:
      burnin: 100
      seed:
      bound total iter: 20
      num saved trees: 10
      hybrid species: H. heurippa
      hybrid tree: user-heliconius.tre
    species:
      H. melpomene: M95
      H. hecale: Hh
      H. cordula: M187
      H. heurippa: Strib40

Example user-heliconius.tre:

    (((H. heurippa: ,H. cydno: ): ,H. melpomene: ): ,H. hecale: );

Output:

    ****************Results*****************
    ...
    Parental trees:
    gamma(H. heurippa) = 1
    ((H. cydno: ,(H. heurippa: ,H. melpomene: ): ): ,H. hecale: );
    Lik:    AIC:    k: 3
    gamma(H. heurippa) = 0
    (((H. heurippa: ,H. cydno: ): ,H. melpomene: ): ,H. hecale: );
    Lik:    AIC:    k: 3
    Hybrid trees:
    (((H. heurippa: ,H. cydno: ): ,H. melpomene: ): ,H. hecale: );
    Lik:    gamma(H. heurippa):    AIC:    k: 4
    ****************** Done ****************

What hybrid species can be considered?
Care must be taken in selecting hybrid species:
- Both members of a sister group cannot be selected as hybrid taxa in a single analysis. However, two analyses can be run (one with each member of the sister group identified as the hybrid), and results will be comparable across runs.
- The outgroup cannot be selected as a hybrid.
- Both of these restrictions result from the fact that, for now, hybridization is only considered between sister taxa.
- More general hybridization relationships can be considered by hand using the user-specified tree feature of STEM-hy.

STEM-hy: Strengths and Weaknesses
STEM-hy makes some fairly strong assumptions:
- Error in estimating gene trees and branch lengths is not incorporated! But the possibility of carrying out a bootstrap analysis helps.
- Information in the sequence data is not used directly; it is only used as summarized by estimated gene divergence times.
- There is a single value of θ for the entire tree.
There are trade-offs involved, and STEM-hy does some things well:
- It is quick (even the tree search does not take long).
- It can handle missing data easily and intuitively.
- Simulations demonstrate reasonable performance (unlikely to be misleading; may be uninformative).

Challenge Datasets
I've created four datasets under varying conditions:
- M1: No hybridization, long intervals between speciation events.
- M2: No hybridization, short intervals between speciation events.
- M3: Low levels of hybridization; B is a hybrid of A and C (species tree as in M1 and M2).
- M4: Extensive hybridization; B is a hybrid of A and C (species tree as in M1 and M2).
All data sets have 6 species, 2 individuals/species, and 10 loci.
GOAL: match each data set to one of the conditions listed above.
Solutions are at lkubatko/solutions.html

STEM-hy Information, References, etc.
Recommended citations - species tree estimation:
- Kubatko, L. S., B. C. Carstens, and L. L. Knowles. STEM: Species Tree Estimation using Maximum likelihood under coalescence. Bioinformatics 25(7).
- Liu, L., L. Yu, and D. K. Pearl. Maximum tree: a consistent estimator of the species tree. Journal of Mathematical Biology 60(1).
- Mossel, E. and S. Roch. Incomplete lineage sorting: Consistent phylogeny estimation from multiple loci. IEEE/ACM Transactions on Computational Biology and Bioinformatics 7(1).
Recommended citations - hybridization:
- Kubatko, L. S. Identifying Hybridization Events in the Presence of Coalescence via Model Selection. Systematic Biology 58(5).

Thank you!
STEM-hy is available at lkubatko/software/stem/
Questions concerning the programs can be sent to kubatko.2@osu.edu.
More informationIncomplete Lineage Sorting: Consistent Phylogeny Estimation From Multiple Loci
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1-2010 Incomplete Lineage Sorting: Consistent Phylogeny Estimation From Multiple Loci Elchanan Mossel University of
More informationTo link to this article: DOI: / URL:
This article was downloaded by:[ohio State University Libraries] [Ohio State University Libraries] On: 22 February 2007 Access Details: [subscription number 731699053] Publisher: Taylor & Francis Informa
More informationJed Chou. April 13, 2015
of of CS598 AGB April 13, 2015 Overview of 1 2 3 4 5 Competing Approaches of Two competing approaches to species tree inference: Summary methods: estimate a tree on each gene alignment then combine gene
More informationTree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny
More informationInferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies
Inferring Phylogenetic Trees Distance Approaches Representing distances in rooted and unrooted trees The distance approach to phylogenies given: an n n matrix M where M ij is the distance between taxa
More informationNJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees
NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana
More informationA (short) introduction to phylogenetics
A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field
More informationEfficient Bayesian Species Tree Inference under the Multispecies Coalescent
Syst. Biol. 66(5):823 842, 2017 The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com
More informationGene Genealogies Coalescence Theory. Annabelle Haudry Glasgow, July 2009
Gene Genealogies Coalescence Theory Annabelle Haudry Glasgow, July 2009 What could tell a gene genealogy? How much diversity in the population? Has the demographic size of the population changed? How?
More informationCoalescent Histories on Phylogenetic Networks and Detection of Hybridization Despite Incomplete Lineage Sorting
Syst. Biol. 60(2):138 149, 2011 c The Author(s) 2011. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com
More informationMaximum Likelihood Inference of Reticulate Evolutionary Histories