Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

A matter of confidence. (Slide figure: a tree with tip states 0, 1, 1.) Short branches give more weight to the inheritance explanation, making us more confident that the ancestral state was 1; long branches give more weight to the convergence explanation, making us less confident that the ancestral state was 1. Copyright 2007 by Paul O. Lewis

Maximum likelihood ancestral state estimation. Data from one character scored at four tips (0 = trait absent, 1 = trait present): two tips in state 0 on branches of length 0.2, and two tips in state 1 on branches of length 0.1 above an upper node, which joins the basal node by a branch of length 0.1. Branch lengths are expected numbers of changes per character. The ancestral state at the basal node is of interest; the state at the upper node is not of direct interest. Copyright 2007 by Paul O. Lewis

"Mk" models "Cavender-Farris" model: 2 possible states (k = 2) expected no. changes = μ t P ii (t) = 0.5 + 0.5 e -2μ t P ij (t) = 0.5-0.5 e -2μ t Haldane, J. B. S. 1919. The combination of linkage values and the calculation of distances between the loci of linked factors. Journal of Genetics 8:299-309. Cavender, J. A. 1978. Taxonomy with confidence. Mathematical Biosciences 40:271-280. Farris, J. S. 1973. A probability model for inferring evolutionary trees. Systematic Zoology 22:250-256. Felsenstein, J. 1981. A likelihood approach to character weighting and what it tells us about parsimony and compatibility. Biological Journal of the Linnean Society 16:183-196. Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pages 21-132 in Mammalian Protein Metabolism (H. N. Munro, ed.) Academic Press, New York. Jukes-Cantor (1969) model: 4 possible states (k = 4) expected no. changes = 3 μ t P ii (t) = 0.25 + 0.75 e -4μ t P ij (t) = 0.25-0.25 e -4μ t Copyright 2007 by Paul O. Lewis

For a branch of length 0.1 (μt = 0.1): P00(0.1) = 0.5 + 0.5e^(-2(0.1)) = 0.90936; P01(0.1) = 0.5 - 0.5e^(-2(0.1)) = 0.09064; P10(0.1) = 0.09064; P11(0.1) = 0.90936. Copyright 2007 by Paul O. Lewis

For a branch of length 0.2 (μt = 0.2): P00(0.2) = 0.5 + 0.5e^(-2(0.2)) = 0.83516; P01(0.2) = 0.5 - 0.5e^(-2(0.2)) = 0.16484; P10(0.2) = 0.16484; P11(0.2) = 0.83516. Copyright 2007 by Paul O. Lewis

Calculating conditional likelihoods at the upper node (its two tips both have state 1, on branches of length 0.1): L0 = P01(.1) P01(.1) = [.09064]^2 = 0.00822; L1 = P11(.1) P11(.1) = [0.90936]^2 = 0.82694. L0 and L1 are known as conditional likelihoods. L0, for example, is the likelihood of the subtree given that the node has state 0: L0 = Pr(data above node | node state = 0). Copyright 2007 by Paul O. Lewis

Calculating the likelihood at the basal node, conditional on the basal node having state 0 (its two tip children have state 0 on branches of length 0.2; the upper node, with L0 = 0.00822 and L1 = 0.82694, is attached by a branch of length 0.1): L0 = P00(.2) P00(.2) P00(.1) L0 + P00(.2) P00(.2) P01(.1) L1 = (.83516)^2 (.90936)(.00822) + (.83516)^2 (.09064)(.82694) = 0.05749. Copyright 2007 by Paul O. Lewis

Calculating the likelihood of the tree using conditional likelihoods at the basal node (L0 = 0.05749, L1 = 0.02045). D here refers to the observed data at the tips; 0 and 1 refer to the state at the basal node. L = Pr(0) Pr(D | 0) + Pr(1) Pr(D | 1) = (0.5) L0 + (0.5) L1 = (0.5)(0.05749) + (0.5)(0.02045) = (0.5)(0.07794) = 0.03897. ln L = -3.24496. Copyright 2007 by Paul O. Lewis
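The two-step calculation on these slides is a small instance of Felsenstein's pruning algorithm. A sketch for the example tree (tip states 0, 0, 1, 1; branch lengths as above; names are ours):

```python
import math

def p_cfn(nu, same):
    """CFN transition probability for a branch of length nu (= mu*t)."""
    e = math.exp(-2.0 * nu)
    return 0.5 + 0.5 * e if same else 0.5 - 0.5 * e

# Tree from the slides: the basal node has two tips in state 0 on
# branches of length 0.2, plus (via a branch of length 0.1) an upper
# node whose two tips are in state 1 on branches of length 0.1.

# Conditional likelihoods at the upper node:
upper = {s: p_cfn(0.1, s == 1) ** 2 for s in (0, 1)}   # L0, L1

# Conditional likelihoods at the basal node:
basal = {}
for s in (0, 1):
    tips = p_cfn(0.2, s == 0) ** 2                     # two state-0 tips
    inner = sum(p_cfn(0.1, s == t) * upper[t] for t in (0, 1))
    basal[s] = tips * inner

L = 0.5 * basal[0] + 0.5 * basal[1]                    # flat root prior
print(round(basal[0], 5), round(basal[1], 5))          # 0.05749 0.02045
print(round(L, 5), round(math.log(L), 3))              # 0.03897 -3.245
```

The printed values reproduce the numbers on the slides.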

Choosing the maximum likelihood ancestral state is simply a matter of seeing which conditional likelihood is the largest. In this case, L0 = 0.05749 (73.8%) is the larger of the two (L1 = 0.02045, 26.2%), so, if forced to decide, you would go with 0 as the ancestral state. Copyright 2007 by Paul O. Lewis

Is this a ML or Bayesian approach? With L0 = 0.05749 (73.8%) and L1 = 0.02045 (26.2%): 0.738 = posterior probability of state 0 = (0.5)(0.05749) / [(0.5)(0.05749) + (0.5)(0.02045)], i.e. prior × likelihood divided by the marginal probability of the data. Converting likelihoods to percentages produces Bayesian posterior probabilities (and implicitly assumes a flat prior over ancestral states). Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino-acid sequences. Genetics 141:1641-1650. Copyright 2007 by Paul O. Lewis
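The conversion from the two likelihoods to a posterior under the flat prior is one line of arithmetic; a sketch:

```python
L0, L1 = 0.05749, 0.02045            # marginal likelihoods from the slides
prior = 0.5                          # flat prior over ancestral states
marginal = prior * L0 + prior * L1   # marginal probability of the data
post0 = prior * L0 / marginal
print(round(post0, 3))               # 0.738
```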

Empirical Bayes vs. Full Bayes The likelihood used in the previous slide is the maximum likelihood, i.e. the likelihood at one particular combination of branch length values (and values of other model parameters). A fully Bayesian approach would approximate the posterior probability associated with state 0 and state 1 at the node of interest using an MCMC analysis that allowed all the model parameters to vary, including the branch lengths. Because some parameters are fixed at their maximum likelihood estimates, this approach is often called an empirical Bayesian method. Copyright 2007 by Paul O. Lewis

Marginal vs. Joint Estimation marginal estimation involves summing likelihood over all states at all nodes except one of interest (most common) joint estimation involves finding the combination of states at all nodes that together account for the highest likelihood (seldom used) Copyright 2007 by Paul O. Lewis

Joint Estimation. Find the likelihood of this particular combination of ancestral states (basal node = 0, upper node = 0; call it L00): L00 = 0.5 P00(.2) P00(.2) P00(.1) P01(.1) P01(.1) = (0.5)(.83516)^2 (.90936)(.09064)^2 = 0.00261. Copyright 2007 by Paul O. Lewis

Joint Estimation. Now we are calculating the likelihood of a different combination of ancestral states (basal node = 0, upper node = 1; call this one L01): L01 = 0.5 P00(.2) P00(.2) P01(.1) P11(.1) P11(.1) = (0.5)(.83516)^2 (.09064)(.90936)^2 = 0.02614. Copyright 2007 by Paul O. Lewis

Joint Estimation. Summary of the four possibilities (X = basal state, Y = upper-node state): L00 = 0.00261; L01 = 0.02614 (best); L10 = 0.00001; L11 = 0.01022. Copyright 2007 by Paul O. Lewis
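Joint estimation for this small tree just enumerates the four state combinations. A sketch (the joint() helper is ours):

```python
import math

def p_cfn(nu, same):
    e = math.exp(-2.0 * nu)
    return 0.5 + 0.5 * e if same else 0.5 - 0.5 * e

def joint(x, y):
    """Joint likelihood that the basal node has state x and the upper
    node state y, for the slides' tree (flat prior at the root)."""
    return (0.5 * p_cfn(0.2, x == 0) ** 2      # two state-0 tips
                * p_cfn(0.1, x == y)           # internal branch
                * p_cfn(0.1, y == 1) ** 2)     # two state-1 tips

L = {(x, y): joint(x, y) for x in (0, 1) for y in (0, 1)}
best = max(L, key=L.get)
print(best)  # (0, 1)
```

Note that the four joint likelihoods sum to the total tree likelihood, 0.03897.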

Returning to marginal estimation... What about a pie diagram for the upper node? Previously, we computed these conditional likelihoods for this node: L0 = 0.00822, L1 = 0.82694. Can we just make a pie diagram from these two values? Copyright 2007 by Paul O. Lewis

Returning to marginal estimation... The conditional likelihoods we computed only included data above this node (i.e. only half the observed states). Copyright 2007 by Paul O. Lewis

Estimation of where changes take place. 1. What is the probability that a change 0 → 1 took place on a particular branch in the tree? 2. What is the probability that the character changed state exactly x times on the tree? Should we answer these questions with a joint or marginal ancestral character state reconstruction?

No

We should figure out the probability of explaining the character state distribution with exactly the set of changes we are interested in, and marginalize over all states that are not crucial to the question.

1. What is the probability that a change 0 → 1 took place on a particular branch in the tree? Fix the ancestral node to 0 and the descendant to 1, calculate the likelihood, and divide that by the total likelihood. 2. What is the probability that the character changed state exactly x times on the tree?
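Applying that recipe to the internal branch of the example tree from the earlier slides (tip states 0, 0, 1, 1): the probability of a 0 → 1 change is the constrained joint likelihood divided by the total. A sketch:

```python
import math

def p_cfn(nu, same):
    e = math.exp(-2.0 * nu)
    return 0.5 + 0.5 * e if same else 0.5 - 0.5 * e

def joint(x, y):  # basal state x, upper-node state y (slides' tree)
    return (0.5 * p_cfn(0.2, x == 0) ** 2
                * p_cfn(0.1, x == y)
                * p_cfn(0.1, y == 1) ** 2)

total = sum(joint(x, y) for x in (0, 1) for y in (0, 1))
pr_change = joint(0, 1) / total   # 0 -> 1 along the internal branch
print(round(pr_change, 3))        # 0.671
```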

1. What is the probability that a change 0 → 1 took place on a particular branch in the tree? 2. What is the probability that the character changed state exactly x times on the tree? Sum the likelihoods over all reconstructions that have exactly x changes. This is not easy to do! Recall that Pr(0 → 1 | ν, θ) from our models accounts for multiple hits. A solution is to sample histories according to their posterior probabilities and summarize these samples.

Ancestral states are one thing... But wouldn't it be nice to estimate not only what happened at the nodes, but also what happened along the edges of the tree? 2007 by Paul O. Lewis

Stochastic mapping simulates an entire character history consistent with the observed character data. In this case, green means perennial and blue means annual in the flowering plant genus Polygonella 2007 by Paul O. Lewis

Three Mappings for Lifespan Blue = annual Green = perennial Each simulation produces a different history (or mapping) consistent with the observed states. 2007 by Paul O. Lewis

Kinds of questions you can answer using stochastic mapping How long, on average, does a character spend in state 0? In state 1? Is the total time in which two characters are both in state 1 more than would be expected if they were evolving independently? What is the number of switches from state 0 to state 1 for a character (on average)? 2007 by Paul O. Lewis

Step 1: Down-pass to obtain conditional likelihoods. This part reflects the calculations needed to estimate ancestral states. 2007 by Paul O. Lewis

Step 2: Up-pass to sample states at the interior nodes. This part resembles a standard simulation, with the exception that we must conform to the observed states at the tips. (The sampling uses posterior probability ∝ likelihood × prior probability.) 2007 by Paul O. Lewis

Choosing a state at an interior node. Once the posterior probabilities have been calculated for all three states, a virtual "roulette wheel" or "pie" can be constructed and used to choose a state at random. In this case, the posterior distribution was: {0.1, 0.2, 0.7}. 2007 by Paul O. Lewis
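The roulette-wheel draw is just inverse-CDF sampling from a discrete distribution; a sketch (names are ours):

```python
import random

def sample_state(posteriors, rng):
    """Spin the roulette wheel: return a state index drawn in
    proportion to its posterior probability."""
    u = rng.random()
    cum = 0.0
    for state, p in enumerate(posteriors):
        cum += p
        if u < cum:
            return state
    return len(posteriors) - 1    # guard against float rounding

rng = random.Random(1)
draws = [sample_state([0.1, 0.2, 0.7], rng) for _ in range(10000)]
print(draws.count(2) / 10000)     # close to 0.7
```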

Sampling state at node 6. We have already chosen the state at the root node. Note: only the calculation for x6 = 0 is presented (but you must do a similar calculation for all three states). 2007 by Paul O. Lewis

Sampling state at node 5. Note: only the calculation for x5 = 0 is presented. 2007 by Paul O. Lewis

"Mapping" edges. Mapping an edge means choosing an evolutionary history for a character along a particular edge that is consistent with the observed states at the tips, the chosen states at the interior nodes, and the model of character change being assumed. This is done by choosing random sojourn times and "causing" a change in state at the end of each of these sojourns. What is a sojourn time? It is the time between substitution events, or the time one has to wait until the next change of character state occurs. 2007 by Paul O. Lewis

(Slide figure: a simulated history along one edge, with four sojourn times and their associated waiting times.) The last sojourn time takes us beyond the end of the branch, so 2 is the final state. Sojourn times are related to substitution rates: high substitution rates mean short sojourn times, on average. A dwell time is the total amount of time spent in one particular state. For example, the dwell time associated with state 0 is the sum of the 1st and 3rd sojourn times. 2007 by Paul O. Lewis

Substitution rates and sojourn times. Symmetric model for a character with 3 states (0, 1, and 2); the rate at which state 0 changes to something else (either state 1 or state 2) is 2β. The rate matrix (rows and columns in state order 0, 1, 2) is:

Q = [ -2β    β    β  ]
    [   β  -2β    β  ]
    [   β    β  -2β  ]

Note that 2βt equals the branch length. For these symmetric models, the branch length is (k-1)βt, where k is the number of states. 2007 by Paul O. Lewis

Sojourn times are exponentially distributed. If λ is the total rate of leaving state 0, then you can simulate a sojourn by: drawing a time according to the cumulative distribution function of the exponential distribution (1 - e^(-λt)), then picking the destination state by normalizing the rates away from state 0 so that they sum to one. This is picking j and t by: Pr(0 → j, t) = Pr(mut. at t) Pr(j | mut.). But we want quantities like: Pr(0 → j, t | 0 → 2, ν1)

Conditioning on the data. We simulate according to Pr(0 → j, t | 0 → 2, ν1) by rejection sampling: 1. Simulate a series of draws of the form Pr(0 → j, t) until the sum of t exceeds ν1. 2. If we ended up in the correct state, accept this series of sojourns as the mapping. 3. If not, go back to step 1.
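A minimal sketch of this rejection sampler for the symmetric 3-state model from the earlier slide (function and variable names are ours):

```python
import random

def map_edge(start, end, nu, rates, rng, max_tries=100000):
    """Rejection-sample a character history along an edge of length nu.
    rates[i][j] is the instantaneous rate i -> j (off-diagonal entries).
    Returns a list of (state, sojourn_time) pairs."""
    k = len(rates)
    for _ in range(max_tries):
        t, state, history = 0.0, start, []
        while True:
            lam = sum(rates[state][j] for j in range(k) if j != state)
            dt = rng.expovariate(lam)          # exponential sojourn time
            if t + dt >= nu:                   # runs past the end of the edge
                history.append((state, nu - t))
                break
            history.append((state, dt))
            t += dt
            # pick the destination in proportion to the rates leaving state
            targets = [j for j in range(k) if j != state]
            u = rng.random() * lam
            nxt = targets[-1]                  # fallback for float rounding
            for j in targets:
                u -= rates[state][j]
                if u <= 0.0:
                    nxt = j
                    break
            state = nxt
        if state == end:                       # condition on the endpoint
            return history
    raise RuntimeError("no mapping accepted")

beta = 1.0   # symmetric 3-state model: total rate of leaving is 2*beta
Q = [[-2 * beta, beta, beta],
     [beta, -2 * beta, beta],
     [beta, beta, -2 * beta]]
hist = map_edge(0, 2, 1.0, Q, random.Random(7))
print(hist[0][0], hist[-1][0])  # 0 2
```

Each accepted history starts in the required state, ends in the required state, and its sojourn times sum to the branch length.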

"Mapping" edge 1 2007 by Paul O. Lewis 20

Failed mapping 2007 by Paul O. Lewis 21

Completed mapping 2007 by Paul O. Lewis 22

Stochastic character mapping (or mutational mapping) allows you to sample from the posterior distribution of histories for characters, conditional on an assumed model and on the observed data. Implemented in Simmap (v1 by J. Bollback; v2 by J. Bollback and J. Yu) and Mesquite (module by P. Midford).

Stochastic character mapping. You can calculate summary statistics on the set of simulated histories to answer almost any type of question that depends on knowing ancestral character states or the number of changes: What proportion of the evolutionary history was spent in state 0? What is the posterior probability that there were between 2 and 4 changes of state? Does the state in one character appear to affect the probability of change at another? (Map both characters and do a contingency analysis.) See Minin and Suchard (2008a,b) and O'Brien et al. (2009) for rejection-free approaches to generating mappings.

Often we are more interested in the forces driving evolutionary patterns than the outcome. Perhaps we should focus on fitting a model of character evolution rather than the ancestral character state estimates.

Rather than this: 1. estimate a tree, 2. estimate ancestral character states, 3. figure out what model of character evolution best fits those ancestral character state reconstructions. We could do this: 1. estimate a tree, 2. figure out what model of character evolution best fits the tip data. Or even this: 1. jointly estimate character evolution and trees in MCMC (marginalizing over trees when answering questions about character evolution).

Correlation of states in a discrete-state model. (Flattened slide table: each branch of the tree is scored for whether character 1 changed and whether character 2 changed, giving a 2×2 contingency table of branch counts; the table's layout did not survive transcription.) Week 8: Paired sites tests, gene frequencies, continuous characters

Example from pp. 429-430 of the MacClade 4 manual, showing characters 1 and 2 mapped on the same cladogram. Q: Does character 2 tend to change from yellow → blue only when character 1 is in the blue state? 2007 by Paul O. Lewis

Concentrated Changes Test. More precisely: how often would three yellow → blue changes occur in the blue areas of the cladogram on the left if these 3 changes were thrown down at random on branches of the tree? Answer: 12.0671% of the time. Thus, we cannot reject the null hypothesis that the observed coincident changes in the two characters were simply the result of chance. 2007 by Paul O. Lewis
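The mechanics of the randomization can be sketched in a few lines. The MacClade example tree is not reproduced here, so the branch list below is a made-up toy (8 equal branches, 2 of them "blue"); only the procedure, not the 12.0671% figure, carries over:

```python
import random

# Hypothetical tree: (length, is_blue) for each branch.
branches = [(1.0, True), (1.0, True)] + [(1.0, False)] * 6
total = sum(length for length, _ in branches)

def throw_changes(n, rng):
    """Drop n changes uniformly at random (weighted by branch length)
    and count how many land on blue branches."""
    hits = 0
    for _ in range(n):
        u = rng.random() * total
        for length, blue in branches:
            u -= length
            if u <= 0.0:
                hits += int(blue)
                break
    return hits

rng = random.Random(0)
trials = 100000
p = sum(throw_changes(3, rng) == 3 for _ in range(trials)) / trials
# p should be near (2/8)**3 = 0.0156 for this toy tree
print(round(p, 3))
```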

Model-based approach to detecting correlated evolution. View the problem as model selection, or use an LRT if you prefer the hypothesis-testing framework. Consider two binary characters, giving four joint states: 0 = (0,0), 1 = (0,1), 2 = (1,0), 3 = (1,1). Under the simplest model, there is only one rate:

        (0,0)  (0,1)  (1,0)  (1,1)
(0,0)     -      q      q      0
(0,1)     q      -      0      q
(1,0)     q      0      -      q
(1,1)     0      q      q      -

Why are there 0's? (Because a single event changes only one character, simultaneous changes in both characters have rate 0.)

Under a model assuming character independence (both characters gain a state at rate q1 and lose it at rate q0):

        (0,0)  (0,1)  (1,0)  (1,1)
(0,0)     -     q1     q1      0
(0,1)    q0      -      0     q1
(1,0)    q0      0      -     q1
(1,1)     0     q0     q0      -

Under the most general model (the letter indicates which character is changing, f = first and s = second, and the subscripts give the starting states of the two characters):

        (0,0)  (0,1)  (1,0)  (1,1)
(0,0)     -     s00    f00     0
(0,1)    s01     -      0     f01
(1,0)    f10     0      -     s10
(1,1)     0     f11    s11     -

(State graph for the general model: (0,0) ⇄ (0,1) at rates s00 and s01; (1,0) ⇄ (1,1) at rates s10 and s11; (0,0) ⇄ (1,0) at rates f00 and f10; (0,1) ⇄ (1,1) at rates f01 and f11.)
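The three models differ only in how many of the eight rates are shared. A sketch that builds all three rate matrices (helper names are hypothetical, not from BayesTraits or any package):

```python
def general_Q(s00, s01, s10, s11, f00, f01, f10, f11):
    """4x4 rate matrix over joint states (0,0), (0,1), (1,0), (1,1);
    f = first character changing, s = second, subscripts = start states."""
    Q = [[0.0, s00, f00, 0.0],
         [s01, 0.0, 0.0, f01],
         [f10, 0.0, 0.0, s10],
         [0.0, f11, s11, 0.0]]
    for i in range(4):
        Q[i][i] = -sum(Q[i])    # rows sum to zero
    return Q

def independence_Q(q0, q1):
    """Both characters gain (0 -> 1) at rate q1 and lose at rate q0."""
    return general_Q(q1, q0, q1, q0, q1, q1, q0, q0)

def simplest_Q(q):
    """A single rate for every permitted single-character change."""
    return general_Q(*([q] * 8))
```

A likelihood-ratio test of independence (2 free rates) against the general model (8 free rates) then has 6 degrees of freedom.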

We can calculate maximum likelihood scores under any of these models and do model selection. Or we could put priors on the parameter values and use Bayesian model selection (e.g. reversible-jump methods in BayesTraits).

References. Minin, V. N. and Suchard, M. A. (2008a). Counting labeled transitions in continuous-time Markov models of evolution. J. Math. Biol. 56:391-412. Minin, V. N. and Suchard, M. A. (2008b). Fast, accurate and simulation-free stochastic mapping. Philos. Trans. R. Soc. Lond. B Biol. Sci. 363:3985-3995. O'Brien, J. D., Minin, V. N., and Suchard, M. A. (2009). Learning to count: robust estimates for labeled distances between molecular sequences. Molecular Biology and Evolution 26(4):801-814.