Accounting for Phylogenetic Uncertainty in Comparative Studies: MCMC and MCMCMC Approaches. Mark Pagel, Reading University.

1 Accounting for Phylogenetic Uncertainty in Comparative Studies: MCMC and MCMCMC Approaches
Mark Pagel, Reading University
[Figure: phylogeny of the Ascomycota fungi showing the evolution of lichen formation. Legend: lichen forming; ambiguous; not lichen forming.]
Mark Pagel, Reading Univ (ITP 5/14/01)

2 Phylogenetic Uncertainty
Three species: rooted trees and unrooted tree.
[Figure: the rooted trees and the unrooted tree for three species A, B, and C.]

3 Number of Possible Phylogenetic Trees
[Table: Species, Unrooted, Rooted. Only one row survives in part: ,027,025 (unrooted) and 34,459,425 (rooted).]
[Figure: No. of trees against No. of tips (species). The rooted and unrooted counts given for N=50 are not recoverable.]

Accounting for Phylogenetic Uncertainty: Markov-Chain Monte Carlo (MCMC) Methods
- generate a long chain of phylogenetic trees (tree proposal and acceptance mechanisms)
- randomly sample from the converged chain
- calculate the event or evolutionary process of interest in each sampled tree

Tree acceptance mechanism: the Metropolis-Hastings algorithm
Accept the new tree with p = 1.0 if $L(T_{n+1}) > L(T_n)$; otherwise accept it with probability $L(T_{n+1}) / L(T_n)$.
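The growth in the number of trees can be checked with the standard counting formulas for bifurcating trees on n labelled species, which the slide itself does not state: (2n-5)!! unrooted trees and (2n-3)!! rooted trees. The following is a minimal Python sketch; the function names are illustrative and not taken from the talk.

```python
def double_factorial(k: int) -> int:
    """Return k!! = k * (k - 2) * (k - 4) * ...; defined as 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n_species: int) -> int:
    """Number of unrooted bifurcating trees on n labelled tips: (2n - 5)!!."""
    if n_species < 3:
        raise ValueError("need at least 3 species")
    return double_factorial(2 * n_species - 5)


def num_rooted_trees(n_species: int) -> int:
    """Number of rooted bifurcating trees on n labelled tips: (2n - 3)!!."""
    if n_species < 2:
        raise ValueError("need at least 2 species")
    return double_factorial(2 * n_species - 3)


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 50):
        print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

For n = 10 the rooted count evaluates to 34,459,425, the value that survives in the table above. Python integers are arbitrary precision, so the N = 50 case is computed exactly.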

4 Metropolis-Hastings Algorithm
Accept the new tree according to:
$$R = \min\left(1,\ \frac{f(X \mid T')}{f(X \mid T)} \times \frac{f(T')}{f(T)} \times \frac{f(T \mid T')}{f(T' \mid T)}\right)$$
The three factors are, in order, the likelihood ratio, the prior ratio, and the proposal ratio.
X = data (e.g., gene sequences); T = tree (topology, branches, parameters).
[Figure: MCMC sampling.]
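The acceptance rule is usually evaluated on the log scale, because tree likelihoods are far too small to represent directly. The sketch below assumes the caller already has log-likelihoods, log-priors, and the log proposal ratio for the current tree T and the proposed tree T'; the function and argument names are hypothetical.

```python
import math
import random


def log_acceptance_ratio(log_lik_new, log_lik_old,
                         log_prior_new=0.0, log_prior_old=0.0,
                         log_proposal_ratio=0.0):
    """Return log R, where R = min(1, likelihood ratio * prior ratio * proposal ratio).

    log_proposal_ratio is log f(T | T') - log f(T' | T); it is 0 for a
    symmetric proposal.
    """
    log_r = ((log_lik_new - log_lik_old)
             + (log_prior_new - log_prior_old)
             + log_proposal_ratio)
    return min(0.0, log_r)


def accept(log_r, rng=random):
    """Accept the proposed tree with probability exp(log_r), where log_r <= 0."""
    return rng.random() < math.exp(log_r)


if __name__ == "__main__":
    # Illustrative numbers only: the proposal is 2 log units worse.
    log_r = log_acceptance_ratio(log_lik_new=-1002.0, log_lik_old=-1000.0)
    print(log_r, accept(log_r))
```

With equal priors and a symmetric proposal this reduces to the simpler rule on the previous slide: an improved tree is always accepted, and a worse tree is accepted with probability L(T')/L(T).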

5 Primer of finding the likelihood of a phylogenetic tree
1. aligned gene-sequence data
   Sheep    ATGGTGAAAA GCCACATAGG CAGTTGGATC CTGGTTCTCT TTGTGGCCAT
   Human    ATGGCGAA CCTTGG CTGCTGGATG CTGGTTCTCT TTGTGGCCAC
   Gorilla  ATGGCGAA CCTTGG CTGCTGGATG CTGGTTCTCT TTGTGGCCAC
   Mink     ATGGTGAAAA GCCACATAGG CAGCTGGCTC CTGGTTCTCT TTGTGGCCAC
2. model of sequence evolution = Q
3. the probability of sequence substitutions in a given branch of the phylogeny: $P(t) = \exp[Qt]$
4. the likelihood of a given phylogenetic tree: $L = \prod_{\text{branches}} P(t) = \prod \exp[Qt]$
5. search alternative topologies
[Figure: alternative topologies for the four species H, G, S, M.]

The probability of observing the ith nucleotide site is
$$p_i(x \mid T, v, Q) = \sum_{n=1}^{4^{s-1}} w_{\mathrm{root}(i)} \prod_{k=1}^{s} p_{n_k, x_{ki}}(v_k, Q) \prod_{k=1}^{s-2} p_{n'_k, n_k}(v_k, Q)$$
where $w_{\mathrm{root}(i)}$ is the prior weight for the ith site, the sum runs over the possible assignments of the ancestral nodes (e.g., 64), the first product is over the s branches leading to species, and the second product is over the s-2 internal branches.

The likelihood is the product over all i nucleotides:
$$L(x \mid T, v, Q) = \prod_i p_i(x \mid T, v, Q)$$
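The per-site sum over ancestral assignments can be sketched by brute force for four species. This example makes several assumptions that are not in the slides: the Jukes-Cantor (JC69) model, whose closed-form transition probabilities stand in for exp[Qt]; the rooted topology ((H,G),(S,M)); equal root weights of 1/4; and made-up branch lengths. Only the first eight alignment columns, which are unambiguous in the transcription, are used.

```python
import itertools
import math

BASES = "ACGT"

# First eight columns of the alignment shown on the slide.
ALIGNMENT = {"S": "ATGGTGAA", "H": "ATGGCGAA", "G": "ATGGCGAA", "M": "ATGGTGAA"}


def jc69_prob(a, b, v):
    """JC69 probability of base a becoming base b over branch length v
    (expected substitutions per site); the closed form of exp[Qt]."""
    e = math.exp(-4.0 * v / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def site_likelihood(tips, v_tip, v_internal, root_weights=None):
    """Sum over the 4**3 = 64 assignments of the three ancestral nodes.

    Assumed tree: root -> a -> (H, G) and root -> b -> (S, M).
    """
    w = root_weights or {base: 0.25 for base in BASES}
    total = 0.0
    for root, a, b in itertools.product(BASES, repeat=3):
        term = w[root]
        # Product over the s - 2 = 2 internal branches.
        term *= jc69_prob(root, a, v_internal) * jc69_prob(root, b, v_internal)
        # Product over the s = 4 branches leading to species.
        term *= jc69_prob(a, tips["H"], v_tip) * jc69_prob(a, tips["G"], v_tip)
        term *= jc69_prob(b, tips["S"], v_tip) * jc69_prob(b, tips["M"], v_tip)
        total += term
    return total


def log_likelihood(alignment, v_tip, v_internal):
    """Log of the product over sites, computed as a sum of per-site logs."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        math.log(site_likelihood({k: s[i] for k, s in alignment.items()},
                                 v_tip, v_internal))
        for i in range(n_sites)
    )


if __name__ == "__main__":
    # Branch lengths are illustrative, not estimates.
    print(log_likelihood(ALIGNMENT, v_tip=0.1, v_internal=0.05))
```

Brute-force enumeration grows as 4^(s-1) and is shown only because it mirrors the formula term by term; practical software uses Felsenstein's pruning algorithm to obtain the same quantity efficiently.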

6 Convergence of a Markov Chain
[Figure: log-likelihood of topology against position in Markov chain; data: 20000Out1.lpd.]

Tree Likelihoods from Markov-Chain Monte Carlo Simulation
n = 54 taxa (Ascomycota fungi); n = 10,000 trees
[Figure: frequency against log-likelihood of tree; data: 2911 best.lpd.]

7 Character Transition Rates: gains and losses of lichenization

                  Single Tree*    MCMC Integration**
gains (q01)       1.04 ( )        1.47 ± 0.32
losses (q10)      2.41 ( )        2.12 ± 0.31

*consensus tree
**n = 20,000 trees
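Integrating over trees amounts to estimating the rate on each tree sampled from the converged chain and then summarising the resulting distribution. The sketch below assumes the ± values are standard deviations across sampled trees, which the slide does not state, and uses made-up per-tree estimates in place of the 20,000 real ones.

```python
import statistics


def summarise_over_trees(per_tree_estimates):
    """Return (mean, standard deviation) of a parameter estimated on each sampled tree."""
    mean = statistics.mean(per_tree_estimates)
    sd = statistics.stdev(per_tree_estimates)
    return mean, sd


if __name__ == "__main__":
    # Hypothetical q01 estimates from five sampled trees (illustration only).
    q01_per_tree = [1.2, 1.6, 1.4, 1.9, 1.3]
    mean, sd = summarise_over_trees(q01_per_tree)
    print(f"q01 = {mean:.2f} ± {sd:.2f}")
```

A single-tree estimate carries no such spread, which is why the two columns in the table can differ in both value and reported uncertainty.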

8 MCMC: Some Issues
- Lack of convergence (poor mixing)
- Tree and parameter proposal mechanisms
- Tree and parameter updating schedules
- Detecting convergence

9 Metropolis-Coupled Markov Chain Monte Carlo (MCMCMC)
Given m simultaneous Markov chains, update the chains, then swap states among a randomly chosen pair i and j each iteration according to:
$$R = \min\left(1,\ \frac{f_i(y_i)\, f_j(y_j)}{f_i(x_i)\, f_j(x_j)}\right)$$
that is, {likelihood ratio for chain i × likelihood ratio for chain j}.
[Figure: chains with states x_i, x_j, x_k before the swap and y_i, y_j, y_k after it.]
Probability of swapping with chain i = R × 2/m.

Temperatures of heated chains
[Figure: temperature (cold to hot) against number of chains, i, starting from the cold chain. Heating schemes shown: 1/i, and 1/(1 + t(i-1)) with t = 0.2 and t = 0.5.]
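A swap step can be sketched as follows. Two assumptions go beyond the slide: chain i targets the posterior raised to a power beta_i = 1/(1 + t(i-1)), with the cold chain at i = 1, and the post-swap states are simply the exchanged pre-swap states (y_i = x_j, y_j = x_i). Under those assumptions the log swap ratio simplifies to (beta_i - beta_j)(log f(x_j) - log f(x_i)). All names are illustrative.

```python
import math
import random


def chain_heats(m, t):
    """Heat beta_i = 1 / (1 + t * (i - 1)) for chains i = 1..m; chain 1 is cold."""
    return [1.0 / (1.0 + t * (i - 1)) for i in range(1, m + 1)]


def log_swap_ratio(beta_i, beta_j, log_f_xi, log_f_xj):
    """Log of R for exchanging the states of chains i and j.

    log_f_xi and log_f_xj are the unheated log densities of the two
    chains' current states.
    """
    return min(0.0, (beta_i - beta_j) * (log_f_xj - log_f_xi))


def maybe_swap(states, log_fs, heats, rng=random):
    """Pick a random pair of chains and exchange their states with probability R."""
    i, j = rng.sample(range(len(states)), 2)
    log_r = log_swap_ratio(heats[i], heats[j], log_fs[i], log_fs[j])
    if rng.random() < math.exp(log_r):
        states[i], states[j] = states[j], states[i]
        log_fs[i], log_fs[j] = log_fs[j], log_fs[i]
        return True
    return False


if __name__ == "__main__":
    heats = chain_heats(4, 0.5)
    states = ["tree_a", "tree_b", "tree_c", "tree_d"]   # placeholder states
    log_fs = [-1000.0, -1004.0, -998.0, -1010.0]        # illustrative values
    print(heats, maybe_swap(states, log_fs, heats))
```

A swap that moves a better state into the colder chain is always accepted; the reverse is accepted with a probability that shrinks as the heats or log densities diverge. Only samples from the cold chain are used for inference.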

10 [Figure: two plots of log-likelihood against generation; 54 taxa Ascomycota data, n = 858 nucleotides.]

11 Phylogeny of Human LINE-1 elements (92 elements, 4 kb sequences)
[Figure: phylogeny of the LINE-1 elements with a time axis in millions of years ago (~120, ~90, ~10-15).]

MCMCMC Analysis of LINEs data
92 LINE elements, 4000 nucleotides
[Figure: log-likelihood against generation for 4 chains and for 1 chain.]

12 LINEs data: simultaneous chains with heating and swapping
[Figure: log-likelihood against generation for the cold, warm, and hot chains, with chain swapping marked.]

LINEs data: log-likelihoods of trees from the cold chain (converged chain)
[Figure: count against log-likelihood for pre-swap trees and post-swap trees.]

13 [Figure: phylogeny of human LINE-1 elements (92 elements, 4 kb sequences).]

Some topics and issues
MCMC (getting stuck and slow progress)
- Tree proposal algorithms (random, deep, shallow, ???)
- Alternation among suites of parameters (topology, branch lengths, model parameters): what schedule to use?
- Optimisation alternating with bouts of M-H selection?
MCMCMC (highly inefficient)
- Tree swapping: encourage early tree swapping?
- Once converged, cool heated chains and use in inference?

A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

More information

Bayesian Inference. Anders Gorm Pedersen. Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU)

Bayesian Inference. Anders Gorm Pedersen. Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) Bayesian Inference Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) Background: Conditional probability A P (B A) = A,B P (A,

More information

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics Bayesian phylogenetics the one true tree? the methods we ve learned so far try to get a single tree that best describes the data however, they admit that they don t search everywhere, and that it is difficult

More information

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development

More information

Infer relationships among three species: Outgroup:

Infer relationships among three species: Outgroup: Infer relationships among three species: Outgroup: Three possible trees (topologies): A C B A B C Model probability 1.0 Prior distribution Data (observations) probability 1.0 Posterior distribution Bayes

More information

Molecular Evolution & Phylogenetics

Molecular Evolution & Phylogenetics Molecular Evolution & Phylogenetics Heuristics based on tree alterations, maximum likelihood, Bayesian methods, statistical confidence measures Jean-Baka Domelevo Entfellner Learning Objectives know basic

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

Estimating Evolutionary Trees. Phylogenetic Methods

Estimating Evolutionary Trees. Phylogenetic Methods Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent

More information

Who was Bayes? Bayesian Phylogenetics. What is Bayes Theorem?

Who was Bayes? Bayesian Phylogenetics. What is Bayes Theorem? Who was Bayes? Bayesian Phylogenetics Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison October 6, 2011 The Reverand Thomas Bayes was born in London in 1702. He was the

More information

Bayesian Phylogenetics

Bayesian Phylogenetics Bayesian Phylogenetics Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison October 6, 2011 Bayesian Phylogenetics 1 / 27 Who was Bayes? The Reverand Thomas Bayes was born

More information

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X

More information

Parallel Tempering I

Parallel Tempering I Parallel Tempering I this is a fancy (M)etropolis-(H)astings algorithm it is also called (M)etropolis (C)oupled MCMC i.e. MCMCMC! (as the name suggests,) it consists of running multiple MH chains in parallel

More information

Discrete & continuous characters: The threshold model

Discrete & continuous characters: The threshold model Discrete & continuous characters: The threshold model Discrete & continuous characters: the threshold model So far we have discussed continuous & discrete character models separately for estimating ancestral

More information

Tree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny

More information

Bayesian Models for Phylogenetic Trees

Bayesian Models for Phylogenetic Trees Bayesian Models for Phylogenetic Trees Clarence Leung* 1 1 McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada ABSTRACT Introduction: Inferring genetic ancestry of different species

More information

ETIKA V PROFESII PSYCHOLÓGA

ETIKA V PROFESII PSYCHOLÓGA P r a ž s k á v y s o k á š k o l a p s y c h o s o c i á l n í c h s t u d i í ETIKA V PROFESII PSYCHOLÓGA N a t á l i a S l o b o d n í k o v á v e d ú c i p r á c e : P h D r. M a r t i n S t r o u

More information

A (short) introduction to phylogenetics

A (short) introduction to phylogenetics A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field

More information

EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

More information

Mixture Models in Phylogenetic Inference. Mark Pagel and Andrew Meade Reading University.

Mixture Models in Phylogenetic Inference. Mark Pagel and Andrew Meade Reading University. Mixture Models in Phylogenetic Inference Mark Pagel and Andrew Meade Reading University m.pagel@rdg.ac.uk Mixture models in phylogenetic inference!some background statistics relevant to phylogenetic inference!mixture

More information

Bayesian Estimation of Ancestral Character States on Phylogenies

Bayesian Estimation of Ancestral Character States on Phylogenies Syst. Biol. 53(5):673 684, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490522232 Bayesian Estimation of Ancestral Character States on

More information

Lecture 6 Phylogenetic Inference

Lecture 6 Phylogenetic Inference Lecture 6 Phylogenetic Inference From Darwin s notebook in 1837 Charles Darwin Willi Hennig From The Origin in 1859 Cladistics Phylogenetic inference Willi Hennig, Cladistics 1. Clade, Monophyletic group,

More information

Machine Learning for Data Science (CS4786) Lecture 24

Machine Learning for Data Science (CS4786) Lecture 24 Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Approximate Bayesian Computation: a simulation based approach to inference

Approximate Bayesian Computation: a simulation based approach to inference Approximate Bayesian Computation: a simulation based approach to inference Richard Wilkinson Simon Tavaré 2 Department of Probability and Statistics University of Sheffield 2 Department of Applied Mathematics

More information

Bayesian Phylogenetics:

Bayesian Phylogenetics: Bayesian Phylogenetics: an introduction Marc A. Suchard msuchard@ucla.edu UCLA Who is this man? How sure are you? The one true tree? Methods we ve learned so far try to find a single tree that best describes

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

One-minute responses. Nice class{no complaints. Your explanations of ML were very clear. The phylogenetics portion made more sense to me today.

One-minute responses. Nice class{no complaints. Your explanations of ML were very clear. The phylogenetics portion made more sense to me today. One-minute responses Nice class{no complaints. Your explanations of ML were very clear. The phylogenetics portion made more sense to me today. The pace/material covered for likelihoods was more dicult

More information

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004, Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin- 1837

More information

Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)

Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression) Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures

More information

Quantifying Uncertainty

Quantifying Uncertainty Sai Ravela M. I. T Last Updated: Spring 2013 1 Markov Chain Monte Carlo Monte Carlo sampling made for large scale problems via Markov Chains Monte Carlo Sampling Rejection Sampling Importance Sampling

More information

Framework for functional tree simulation applied to 'golden delicious' apple trees

Framework for functional tree simulation applied to 'golden delicious' apple trees Purdue University Purdue e-pubs Open Access Theses Theses and Dissertations Spring 2015 Framework for functional tree simulation applied to 'golden delicious' apple trees Marek Fiser Purdue University

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

T i t l e o f t h e w o r k : L a M a r e a Y o k o h a m a. A r t i s t : M a r i a n o P e n s o t t i ( P l a y w r i g h t, D i r e c t o r )

T i t l e o f t h e w o r k : L a M a r e a Y o k o h a m a. A r t i s t : M a r i a n o P e n s o t t i ( P l a y w r i g h t, D i r e c t o r ) v e r. E N G O u t l i n e T i t l e o f t h e w o r k : L a M a r e a Y o k o h a m a A r t i s t : M a r i a n o P e n s o t t i ( P l a y w r i g h t, D i r e c t o r ) C o n t e n t s : T h i s w o

More information

CS 188: Artificial Intelligence. Bayes Nets

CS 188: Artificial Intelligence. Bayes Nets CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

Bayesian Classification and Regression Trees

Bayesian Classification and Regression Trees Bayesian Classification and Regression Trees James Cussens York Centre for Complex Systems Analysis & Dept of Computer Science University of York, UK 1 Outline Problems for Lessons from Bayesian phylogeny

More information

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder Bayesian inference & Markov chain Monte Carlo Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder Note 2: Paul Lewis has written nice software for demonstrating Markov

More information

CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES

CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES INTRODUCTION CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES This worksheet complements the Click and Learn developed in conjunction with the 2011 Holiday Lectures on Science, Bones, Stones, and Genes:

More information

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe B C D B C D (((,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood

More information

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018 Graphical Models Markov Chain Monte Carlo Inference Siamak Ravanbakhsh Winter 2018 Learning objectives Markov chains the idea behind Markov Chain Monte Carlo (MCMC) two important examples: Gibbs sampling

More information

Phylogeny. November 7, 2017

Phylogeny. November 7, 2017 Phylogeny November 7, 2017 Phylogenetics Phylon = tribe/race, genetikos = relative to birth Phylogenetics: study of evolutionary relationships among organisms, sequences, or anything in between Related

More information

Rapid evolution of the cerebellum in humans and other great apes

Rapid evolution of the cerebellum in humans and other great apes Rapid evolution of the cerebellum in humans and other great apes Article Accepted Version Barton, R. A. and Venditti, C. (2014) Rapid evolution of the cerebellum in humans and other great apes. Current

More information

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega BLAST Multiple Sequence Alignments: Clustal Omega What does basic BLAST do (e.g. what is input sequence and how does BLAST look for matches?) Susan Parrish McDaniel College Multiple Sequence Alignments

More information

Markov chain Monte-Carlo to estimate speciation and extinction rates: making use of the forest hidden behind the (phylogenetic) tree

Markov chain Monte-Carlo to estimate speciation and extinction rates: making use of the forest hidden behind the (phylogenetic) tree Markov chain Monte-Carlo to estimate speciation and extinction rates: making use of the forest hidden behind the (phylogenetic) tree Nicolas Salamin Department of Ecology and Evolution University of Lausanne

More information

CS 343: Artificial Intelligence

CS 343: Artificial Intelligence CS 343: Artificial Intelligence Bayes Nets: Sampling Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley.

More information

Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Evolutionary trees. Describe the relationship between objects, e.g. species or genes Evolutionary trees Bonobo Chimpanzee Human Neanderthal Gorilla Orangutan Describe the relationship between objects, e.g. species or genes Early evolutionary studies The evolutionary relationships between

More information

HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

C.DARWIN ( )

C.DARWIN ( ) C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 Sequential parallel tempering With the development of science and technology, we more and more need to deal with high dimensional systems. For example, we need to align a group of protein or DNA sequences

More information

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,

More information

MCMC and Gibbs Sampling. Kayhan Batmanghelich

MCMC and Gibbs Sampling. Kayhan Batmanghelich MCMC and Gibbs Sampling Kayhan Batmanghelich 1 Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction

More information

Consensus methods. Strict consensus methods

Consensus methods. Strict consensus methods Consensus methods A consensus tree is a summary of the agreement among a set of fundamental trees There are many consensus methods that differ in: 1. the kind of agreement 2. the level of agreement Consensus

More information

Lecture 12: Bayesian phylogenetics and Markov chain Monte Carlo Will Freyman

Lecture 12: Bayesian phylogenetics and Markov chain Monte Carlo Will Freyman IB200, Spring 2016 University of California, Berkeley Lecture 12: Bayesian phylogenetics and Markov chain Monte Carlo Will Freyman 1 Basic Probability Theory Probability is a quantitative measurement of

More information

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley

PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION Integrative Biology 200B Spring 2009 University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley B.D. Mishler Jan. 22, 2009. Trees I. Summary of previous lecture: Hennigian

More information

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Phylogenetic analyses. Kirsi Kostamo

Phylogenetic analyses. Kirsi Kostamo Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

More information

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016 Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016 By Philip J. Bergmann 0. Laboratory Objectives 1. Learn what Bayes Theorem and Bayesian Inference are 2. Reinforce the properties

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

More information

Inferring Speciation Times under an Episodic Molecular Clock

Inferring Speciation Times under an Episodic Molecular Clock Syst. Biol. 56(3):453 466, 2007 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150701420643 Inferring Speciation Times under an Episodic Molecular

More information

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

More information

Juan Juan Salon. EH National Bank. Sandwich Shop Nail Design. OSKA Beverly. Chase Bank. Marina Rinaldi. Orogold. Mariposa.

Juan Juan Salon. EH National Bank. Sandwich Shop Nail Design. OSKA Beverly. Chase Bank. Marina Rinaldi. Orogold. Mariposa. ( ) X é X é Q Ó / 8 ( ) Q / ( ) ( ) : ( ) : 44-3-8999 433 4 z 78-19 941, #115 Z 385-194 77-51 76-51 74-7777, 75-5 47-55 74-8141 74-5115 78-3344 73-3 14 81-4 86-784 78-33 551-888 j 48-4 61-35 z/ zz / 138

More information

Phylogenetics in the Age of Genomics: Prospects and Challenges

Phylogenetics in the Age of Genomics: Prospects and Challenges Phylogenetics in the Age of Genomics: Prospects and Challenges Antonis Rokas Department of Biological Sciences, Vanderbilt University http://as.vanderbilt.edu/rokaslab http://pubmed2wordle.appspot.com/

More information

Phylogeny: building the tree of life

Phylogeny: building the tree of life Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Sami Khuri, Department of Computer Science, San José State University, San José, California, USA. sami.khuri@sjsu.edu. Distance Methods; Character Methods; Molecular Clock; UPGMA; Maximum Parsimony

Bayesian Phylogenetics

Paul O. Lewis, Department of Ecology & Evolutionary Biology, University of Connecticut. Woods Hole Molecular Evolution Workshop, July 27, 2006. 2006 Paul O. Lewis

Dr. Amira A. AL-Hosary

Phylogenetic analysis. Amira A. AL-Hosary, PhD of infectious diseases, Department of Animal Medicine (Infectious Diseases), Faculty of Veterinary Medicine, Assiut University, Egypt. Phylogenetic Basics: Biological

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

Summarizing a posterior. Given the data and prior, the posterior is determined. Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests. Most of these computations are integrals

Phylogenetics: Parsimony and Likelihood. COMP Spring 2016 Luay Nakhleh, Rice University

COMP 571 - Spring 2016. Luay Nakhleh, Rice University. The Problem. Input: multiple alignment of a set S of sequences. Output: tree T leaf-labeled with S. Assumptions

Announcements. Inference. Mid-term. Inference by Enumeration. Reminder: Alarm Network. Introduction to Artificial Intelligence. V22.

Introduction to Artificial Intelligence, V22.0472-001, Fall 2009. Lecture 15: Bayes Nets 3. Rob Fergus, Dept of Computer Science, Courant Institute, NYU. Announcements: midterms graded; Assignment 2 graded. Slides

Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information #

Details of PRF Methodology. In the Poisson Random Field (PRF) model, it is assumed that non-synonymous mutations at a given gene are either

Bayesian Estimation of Concordance among Gene Trees

University of Nebraska - Lincoln. DigitalCommons@University of Nebraska - Lincoln. Faculty Publications in the Biological Sciences, Papers in the Biological Sciences, 2007

Markov Chains and MCMC

CompSci 590.02. Instructor: Ashwin Machanavajjhala. Lecture 4: 590.02 Spring 13. Recap: Monte Carlo Method. If U is a universe of items, and G is a subset satisfying some property,

Phylogenetics: Parsimony

COMP 571. Luay Nakhleh, Rice University. The Problem. Input: multiple alignment of a set S of sequences. Output: tree T leaf-labeled with S. Assumptions: characters are mutually independent

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

In deriving a phylogeny, our goal is simply to reconstruct the historical relationships between a group of taxa. Before we review the

Tutorial on ABC Algorithms

Dr Chris Drovandi, Queensland University of Technology, Australia. c.drovandi@qut.edu.au. July 3, 2014. Notation: model parameter θ with prior π(θ). Likelihood is f(y | θ) with observed

Phylogenetic inference

Bas E. Dutilh. Systems Biology: Bioinformatic Data Analysis, Utrecht University, March 7th 2016. After this lecture, you can discuss (dis-)advantages of different information types

An Investigation of Phylogenetic Likelihood Methods

Tiffani L. Williams and Bernard M.E. Moret, Department of Computer Science, University of New Mexico, Albuquerque, NM 87131-1386. Email: {tlw,moret}@cs.unm.edu

A Bayesian Analysis of Metazoan Mitochondrial Genome Arrangements

Bret Larget, Donald L. Simon, Joseph B. Kadane, and Deborah Sweet. Departments of Botany and of Statistics, University of Wisconsin

BINF6201/8201. Molecular phylogenetic methods

0-7-06. Phylogenetics. According to the evolutionary theory, all life forms on this planet are related to one another by descent. Traditionally, phylogenetics

Bayes Nets: Sampling

[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.] Approximate Inference:

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010

Model-Averaged l1 Regularization using Markov Chain Monte Carlo Model Composition. Technical Report No. 541, Department of Statistics, University of Washington. Chris Fraley and Daniel Percival. August 22,

Monte Carlo in Bayesian Statistics

Matthew Thomas, SAMBa - University of Bath. m.l.thomas@bath.ac.uk. December 4, 2014. Overview

Evolutionary Tree Analysis. Overview

CSI/BINF 5330 Evolutionary Tree Analysis. Young-Rae Cho, Associate Professor, Department of Computer Science, Baylor University. Overview: Backgrounds; Distance-Based Evolutionary Tree Reconstruction; Character-Based

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Pattern Recognition and Machine Learning, Chapter 11: Sampling Methods. Elise Arnaud, Jakob Verbeek. May 22, 2008. Outline of the chapter: 11.1 Basic Sampling Algorithms; 11.2 Markov Chain Monte Carlo; 11.3 Gibbs

Theory of Evolution Charles Darwin

1858-59: Origin of Species. 5-year voyage of H.M.S. Beagle (1831-36). Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

Multiple Sequence Alignment. Sequences

Sequences: >YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd >crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe

How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?

gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time. Substitution rate. Can we observe time or subst. rate? What

HIGH PERFORMANCE, BAYESIAN BASED PHYLOGENETIC INFERENCE FRAMEWORK

By Xizhou Feng. Bachelor of Engineering, China Textile University, 1993. Master of Science, Tsinghua University, 1996. Submitted in Partial

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Bill Pearson, wrp@virginia.edu, 4-2818, Jordan 6-057. Tree estimation strategies: Parsimony: no model, simply count minimum number

Introduction to characters and parsimony analysis

Genetic Relationships. Genetic relationships exist between individuals within populations. These include ancestor-descendent relationships and more indirect

Multimodal Nested Sampling

Farhan Feroz, Astrophysics Group, Cavendish Lab, Cambridge. Inverse Problems & Cosmology. Most obvious example: standard CMB data analysis pipeline. But many others: object detection,

Bayesian Inference and MCMC

Aryan Arbabi. Partly based on MCMC slides from CSC412 Fall 2018. Bayesian Inference - Motivation. Consider we have a data set D = {x_1, ..., x_n}. E.g. each x_i can be the

Phylogeny: traditional and Bayesian approaches

5-Feb-2014. DEKM book. Notes from Dr. B. John. Holder and Lewis, Nature Reviews Genetics 4, 275-284, 2003. Phylogeny: a graph depicting the ancestor-descendent
