# Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Size: px
Start display at page:

Download "Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!"

Transcription

1 Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site:

2 y = x x x x x Why do we need models? y = x Copyright 2007 Paul O. Lewis 2

3 Models Models help us intelligently interpolate between our observations for purposes of making predictions Adding parameters to a model generally increases its fit to the data Underparameterized models lead to poor fit to observed data points Overparameterized models lead to poor prediction of future observations Criteria for choosing models include likelihood ratio tests, AIC, BIC, Bayes Factors, etc. all provide a way to choose a model that is neither underparameterized nor overparameterized Copyright 2007 Paul O. Lewis 3

4 The Poisson distribution Probability distribution on the number of events when: 1. events are assumed to be independent, 2. the rate of events some constant, µ, and 3. the process continues for some duration of time, t. The expectation of the number of events is ν = µt. Note that ν can be any non-negative number, but the Poisson is a discrete distribution it gives the probabilities of the number of events (and this number will always be a non-negative integer).

5 The Poisson distribution Pr(k events Expected # is ν) = νk e ν k! Pr(0 events) = ν0 e ν 0! = e ν = e µt Pr( 1 events) = 1 e ν = 1 e µt

6 "Disruptions" vs. substitutions When a disruption occurs, any base can appear in a sequence. T? Note: disruption is my term for this make-believe event. You will not see this term in the literature. If the base that A appears is different C G from the base that was already there, then a substitution event has occurred. T The rate at which any particular substitution occurs will be 1/4 the disruption rate (assuming equal base frequencies) Copyright 2007 Paul O. Lewis 13

7 Probability of T G over time t If µ is the rate of disruptions, and a branch is t units of time long then: Let s use θ for the rate of any particular disruption. µ T A = µ T C = µ T G = µ T T = θ µ = 4θ Furthermore, given that there is a disruption the chance of any particular change is 1 4

8 Probability of T G over time t Pr(0 disruptions t) = e µt Pr(at least 1 disruption t) = 1 e µt Pr(last disruption leads to G) = 0.25 Pr(T G t) = 0.25 ( 1 e µt) = 0.25 ( 1 e 4θt)

9 JC69 model Bases are assumed to be equally frequent (all 0.25) Assumes rate of substitution (α) is the same for all possible substitutions Usually described as a 1-parameter model (the parameter being α) Remember, however, that each edge in a tree can have its own α, so there are really as many parameters in the model as there are edges in the tree! Jukes, T. H., and C. R. Cantor Evolution of protein molecules. Pages in H. N. Munro (ed.), Mammalian Protein Metabolism. Academic Press, New York. Copyright 2007 Paul O. Lewis 15

10 JC transition probabilities Pr(T A t) = 0.25 ( 1 e 4θt) Pr(T C t) = 0.25 ( 1 e 4θt) Pr(T G t) = 0.25 ( 1 e 4θt) Pr(T T t) = 0.25 ( 1 e 4θt) but this only adds up to: ( 1 e 4θt ) instead of 1!

11 We left out the probability of no disruptions:e 4θt So: Pr(T A t) = 0.25 ( 1 e 4θt) Pr(T C t) = 0.25 ( 1 e 4θt) Pr(T G t) = 0.25 ( 1 e 4θt) Pr(T T t) = e 4θt ( 1 e 4θt) = e 4θt

12 JC transition probabilities Pr(i j t) = 0.25 ( 1 e 4θt) Pr(i i t) = e 4θt When t = 0, then e 4θt = 1, and: Pr(i j t) = 0 Pr(i i t) = 1

13 JC transition probabilities Pr(i j t) = 0.25 ( 1 e 4θt) Pr(i i t) = e 4θt When t =, then e 4θt = 0, and: Pr(i j t) = 0.25 Pr(i i t) = 0.25

14 Probability of A present as a function of time Upper curve assumes we started with A at time 0. Over time, the probability of still seeing an A at this site drops because rate of changing to one of the other three bases is 3α (so rate of staying the same is -3α). The equilibrium relative frequency of A is 0.25 Lower curve assumes we started with some state other than A (T is used here). Over time, the probability of seeing an A at this site grows because the rate at which the current base will change into an A is α. Copyright 2007 Paul O. Lewis 24

15 Water analogy (time 0) 3α A C G T Start with container A completely full and others empty Imagine that all containers are connected by tubes that allow same rate of flow between any two Initially, A will be losing water at 3 times the rate that C (or G or T) gains water α Copyright 2007 Paul O. Lewis 25

16 Water analogy (after some time) A C G T A s level is not dropping as fast now because it is now also receiving water from C, G and T Copyright 2007 Paul O. Lewis 26

17 Water analogy (after a very long time) A C G T Eventually, all containers are one fourth full and there is zero net volume change stationarity (equilibrium) has been achieved (Thanks to Kent Holsinger for this analogy) Copyright 2007 Paul O. Lewis 27

18 JC instantaneous rate matrix - the Q matrix for JC The 1 parameter is α (sometimes parameterized in terms of µ). This is the rate of replacements ( disruptions that change the state): To State A C G T From A 3α α α α C α 3α α α State G α α 3α α T α α α 3α

19 Change probabilities We can calculate a transition probability matrix as a function of time by: P(t) = e Qt The important thing to note is the rates (Q matrix) is multiplied by the time. We can t separate rates and times since we always see the effect of their product. Is a medium level of character divergence: 1. medium rate of change and medium amount of time, 2. high rate, but short time period, 3. low rate, but a long time period?

20 JC instantaneous rate matrix again What if you do not know the length of time for a branch in the tree? We estimate branch lengths in terms of character divergence the product of rate and time. What is important is that we know the relative rates of different types of substitutions, so JC can be expressed: To State A C G T From A C State G T

21 JC instantaneous rate matrix yet again We estimate branch lengths in terms of expected number of changes per site. To do this we standardize the total rate of divergence in the Q matrix and estimate ν = µt = 3αt for each branch. From A C State G 1 3 T 1 3 To State A C G T

22 ? model or the K80 model Transitions and transversions occur at different rates: To State A C G T From A 2β α β α β C β 2β α β α State G α β 2β α β T β α β 2β α

23 ? model or the K80 model. Reparameterized. Once again, we care only about the relative rates, so we can choose one rate to be frame of reference. This turns the 2 parameter model into a 1 parameter form: From State To State A C G T A (2 + κ)β β κβ β C β (2 + κ) β κβ G κβ β (2 + κ) β T β κβ β (2 + κ)

24 ? model or the K80 model. Reparameterized again. To State A C G T From A 2 κ 1 κ 1 C 1 2 κ 1 κ State G κ 1 2 κ 1 T 1 κ 1 2 κ

25 Kappa is the transititon/transversion rate ratio: κ = α β (if κ = 1 then we are back to JC).

26 What is the instantaneous probability of an particular transversion? Pr(A C) = Pr(A) Pr(change to C) = 1 4 (βdt)

27 What is the instantaneous probability of an particular transition? Pr(A G) = Pr(A) Pr(change to G) = 1 4 (κβdt)

28 There are four types of transitions: A G, G A, C T, T C and eight types of transversions: A C, A T, G C, G T, C A, C G, T A, T G Ti/Tv ratio = Pr(any transition) Pr(any transversion) = 4 ( 1 4 (κβdt)) 8 ( 1 4 (βdt)) = κ 2 For K2P instantaneous transition/transversion ratio is one-half the instantaneous transition/transversion rate ratio

29 Felsenstein 1981 model or F81 model To State A C G T From A π C π G π T C π State A π G π T G π A π C π T T π A π C π G

30 HKY 1985 model To State A C G T From A π C κπ G π T C π State A π G κπ T G κπ A π C π T T π A κπ C π G

31 F84 model: F84* vs. HKY85 μ rate of process generating all types of substitutions kμ rate of process generating only transitions Becomes F81 model if k = 0 HKY85 model: β rate of process generating only transversions κβ rate of process generating only transitions Becomes F81 model if κ = 1 *First used in PHYLIP in 1984, first published bykishino, H., and M. Hasegawa Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoidea. Journal of Molecular Evolution 29: Copyright 2007 Paul O. Lewis 38

32 General Time Reversible GTR model To State A C G T From A aπ C bπ G cπ T C aπ State A dπ G eπ T G bπ A dπ C fπ T T cπ A eπ C fπ G In PAUP, f = 1 indicating that G T is the reference rate

33 References Kimura, M. (1980). A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16:

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?

How should we go about modeling this? gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time Substitution rate Can we observe time or subst. rate? What

### Substitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A

GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics June 1, 2009 Smithsonian Workshop on Molecular Evolution Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs, CT Copyright 2009

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 26 January 2011 Workshop on Molecular Evolution Český Krumlov, Česká republika Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut,

### Mutation models I: basic nucleotide sequence mutation models

Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or

### What Is Conservation?

What Is Conservation? Lee A. Newberg February 22, 2005 A Central Dogma Junk DNA mutates at a background rate, but functional DNA exhibits conservation. Today s Question What is this conservation? Lee A.

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22

Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and

### Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26

Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and

### Nucleotide substitution models

Nucleotide substitution models Alexander Churbanov University of Wyoming, Laramie Nucleotide substitution models p. 1/23 Jukes and Cantor s model [1] The simples symmetrical model of DNA evolution All

### Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

### Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018

Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### Molecular Evolution, course # Final Exam, May 3, 2006

Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least

### Phylogenetics: Building Phylogenetic Trees

1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 29 July 2014 Workshop on Molecular Evolution Woods Hole, Massachusetts Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2014 Woods Hole Workshop

### Maximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.

Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters - "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf

### Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36

### Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 23 July 2013 Workshop on Molecular Evolution Woods Hole, Massachusetts Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs,

### Likelihood in Phylogenetics

Likelihood in Phylogenetics 22 July 2017 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2017 Woods Hole Workshop in Molecular

### Lab 9: Maximum Likelihood and Modeltest

Integrative Biology 200A University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS" Spring 2010 Updated by Nick Matzke Lab 9: Maximum Likelihood and Modeltest In this lab we re going to use PAUP*

### Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

### Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of

### Lecture Notes: Markov chains

Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 28 January 2015 Paul O. Lewis Department of Ecology & Evolutionary Biology Workshop on Molecular Evolution Český Krumlov Paul O. Lewis (2015 Czech Republic Workshop

### Genetic distances and nucleotide substitution models

4 Genetic distances and nucleotide substitution models THEORY Korbinian Strimmer and Arndt von Haeseler 4.1 Introduction One of the first steps in the analysis of aligned nucleotide or amino acid sequences

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 21 July 2015 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2015 Woods Hole Workshop in

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 21 July 2015 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2015 Woods Hole Workshop in

### Phylogenetics. Andreas Bernauer, March 28, Expected number of substitutions using matrix algebra 2

Phylogenetics Andreas Bernauer, andreas@carrot.mcb.uconn.edu March 28, 2004 Contents 1 ts:tr rate ratio vs. ts:tr ratio 1 2 Expected number of substitutions using matrix algebra 2 3 Why the GTR model can

### Inferring Molecular Phylogeny

Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

### Letter to the Editor. Department of Biology, Arizona State University

Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona

Preliminaries Download PAUP* from: http://people.sc.fsu.edu/~dswofford/paup_test 1 A model of the Boston T System 1 Idea from Paul Lewis A simpler model? 2 Why do models matter? Model-based methods including

### 1. Can we use the CFN model for morphological traits?

1. Can we use the CFN model for morphological traits? 2. Can we use something like the GTR model for morphological traits? 3. Stochastic Dollo. 4. Continuous characters. Mk models k-state variants of the

### Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood For: Prof. Partensky Group: Jimin zhu Rama Sharma Sravanthi Polsani Xin Gong Shlomit klopman April. 7. 2003 Table of Contents Introduction...3

### Taming the Beast Workshop

Workshop David Rasmussen & arsten Magnus June 27, 2016 1 / 31 Outline of sequence evolution: rate matrices Markov chain model Variable rates amongst different sites: +Γ Implementation in BES2 2 / 31 genotype

### BMI/CS 776 Lecture 4. Colin Dewey

BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2

### Inferring Complex DNA Substitution Processes on Phylogenies Using Uniformization and Data Augmentation

Syst Biol 55(2):259 269, 2006 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 101080/10635150500541599 Inferring Complex DNA Substitution Processes on Phylogenies

### Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley. Parsimony & Likelihood [draft]

Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley K.W. Will Parsimony & Likelihood [draft] 1. Hennig and Parsimony: Hennig was not concerned with parsimony

### Chapter 7: Models of discrete character evolution

Chapter 7: Models of discrete character evolution pdf version R markdown to recreate analyses Biological motivation: Limblessness as a discrete trait Squamates, the clade that includes all living species

### Evolutionary Analysis of Viral Genomes

University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral

### KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging

Method KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging Zhang Zhang 1,2,3#, Jun Li 2#, Xiao-Qian Zhao 2,3, Jun Wang 1,2,4, Gane Ka-Shu Wong 2,4,5, and Jun Yu 1,2,4 * 1

### Edward Susko Department of Mathematics and Statistics, Dalhousie University. Introduction. Installation

1 dist est: Estimation of Rates-Across-Sites Distributions in Phylogenetic Subsititution Models Version 1.0 Edward Susko Department of Mathematics and Statistics, Dalhousie University Introduction The

### In: P. Lemey, M. Salemi and A.-M. Vandamme (eds.). To appear in: The. Chapter 4. Nucleotide Substitution Models

In: P. Lemey, M. Salemi and A.-M. Vandamme (eds.). To appear in: The Phylogenetic Handbook. 2 nd Edition. Cambridge University Press, UK. (final version 21. 9. 2006) Chapter 4. Nucleotide Substitution

### Week 5: Distance methods, DNA and protein models

Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03

### Reconstruire le passé biologique modèles, méthodes, performances, limites

Reconstruire le passé biologique modèles, méthodes, performances, limites Olivier Gascuel Centre de Bioinformatique, Biostatistique et Biologie Intégrative C3BI USR 3756 Institut Pasteur & CNRS Reconstruire

### In: M. Salemi and A.-M. Vandamme (eds.). To appear. The. Phylogenetic Handbook. Cambridge University Press, UK.

In: M. Salemi and A.-M. Vandamme (eds.). To appear. The Phylogenetic Handbook. Cambridge University Press, UK. Chapter 4. Nucleotide Substitution Models THEORY Korbinian Strimmer () and Arndt von Haeseler

### Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).

1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff

### 7. Tests for selection

Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www. nowicklab.info

### Bayesian Phylogenetics

Bayesian Phylogenetics Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut Woods Hole Molecular Evolution Workshop, July 27, 2006 2006 Paul O. Lewis Bayesian Phylogenetics

### Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/39

### Phylogenetic Inference using RevBayes

Phylogenetic Inference using RevBayes Substitution Models Sebastian Höhna 1 Overview This tutorial demonstrates how to set up and perform analyses using common nucleotide substitution models. The substitution

### POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

### Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony?no model, simply count minimum number

### Counting phylogenetic invariants in some simple cases. Joseph Felsenstein. Department of Genetics SK-50. University of Washington

Counting phylogenetic invariants in some simple cases Joseph Felsenstein Department of Genetics SK-50 University of Washington Seattle, Washington 98195 Running Headline: Counting Phylogenetic Invariants

### Phylogenetic Gibbs Recursive Sampler. Locating Transcription Factor Binding Sites

A for Locating Transcription Factor Binding Sites Sean P. Conlan 1 Lee Ann McCue 2 1,3 Thomas M. Smith 3 William Thompson 4 Charles E. Lawrence 4 1 Wadsworth Center, New York State Department of Health

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### Pinvar approach. Remarks: invariable sites (evolve at relative rate 0) variable sites (evolves at relative rate r)

Pinvar approach Unlike the site-specific rates approach, this approach does not require you to assign sites to rate categories Assumes there are only two classes of sites: invariable sites (evolve at relative

### C.DARWIN ( )

C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships

### Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences

Mathematical Statistics Stockholm University Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences Bodil Svennblad Tom Britton Research Report 2007:2 ISSN 650-0377 Postal

### Phylogenetic Inference using RevBayes

Phylogenetic Inference using RevBayes Model section using Bayes factors Sebastian Höhna 1 Overview This tutorial demonstrates some general principles of Bayesian model comparison, which is based on estimating

### Phylogenetic Tree Reconstruction

I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

### Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

### Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences

Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 1 Learning Objectives

### A Phylogenetic Gibbs Recursive Sampler for Locating Transcription Factor Binding Sites

A for Locating Transcription Factor Binding Sites Sean P. Conlan 1 Lee Ann McCue 2 1,3 Thomas M. Smith 3 William Thompson 4 Charles E. Lawrence 4 1 Wadsworth Center, New York State Department of Health

### Evolutionary Models. Evolutionary Models

Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### Efficiencies of maximum likelihood methods of phylogenetic inferences when different substitution models are used

Molecular Phylogenetics and Evolution 31 (2004) 865 873 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev Efficiencies of maximum likelihood methods of phylogenetic inferences when different

### Lie Markov models. Jeremy Sumner. School of Physical Sciences University of Tasmania, Australia

Lie Markov models Jeremy Sumner School of Physical Sciences University of Tasmania, Australia Stochastic Modelling Meets Phylogenetics, UTAS, November 2015 Jeremy Sumner Lie Markov models 1 / 23 The theory

### π b = a π a P a,b = Q a,b δ + o(δ) = 1 + Q a,a δ + o(δ) = I 4 + Qδ + o(δ),

ABC estimation of the scaled effective population size. Geoff Nicholls, DTC 07/05/08 Refer to http://www.stats.ox.ac.uk/~nicholls/dtc/tt08/ for material. We will begin with a practical on ABC estimation

### "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

### A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

### Natural selection on the molecular level

Natural selection on the molecular level Fundamentals of molecular evolution How DNA and protein sequences evolve? Genetic variability in evolution } Mutations } forming novel alleles } Inversions } change

### Introduction to MEGA

Introduction to MEGA Download at: http://www.megasoftware.net/index.html Thomas Randall, PhD tarandal@email.unc.edu Manual at: www.megasoftware.net/mega4 Use of phylogenetic analysis software tools Bioinformatics

### The Importance of Proper Model Assumption in Bayesian Phylogenetics

Syst. Biol. 53(2):265 277, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490423520 The Importance of Proper Model Assumption in Bayesian

### Thanks to Paul Lewis and Joe Felsenstein for the use of slides

Thanks to Paul Lewis and Joe Felsenstein for the use of slides Review Hennigian logic reconstructs the tree if we know polarity of characters and there is no homoplasy UPGMA infers a tree from a distance

### Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

### Quantifying sequence similarity

Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

### Dr. Amira A. AL-Hosary

Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

### RELATING PHYSICOCHEMMICAL PROPERTIES OF AMINO ACIDS TO VARIABLE NUCLEOTIDE SUBSTITUTION PATTERNS AMONG SITES ZIHENG YANG

RELATING PHYSICOCHEMMICAL PROPERTIES OF AMINO ACIDS TO VARIABLE NUCLEOTIDE SUBSTITUTION PATTERNS AMONG SITES ZIHENG YANG Department of Biology (Galton Laboratory), University College London, 4 Stephenson

### THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

### Homoplasy. Selection of models of molecular evolution. Evolutionary correction. Saturation

Homoplasy Selection of models of molecular evolution David Posada Homoplasy indicates identity not produced by descent from a common ancestor. Graduate class in Phylogenetics, Campus Agrário de Vairão,

### A phylogenetic view on RNA structure evolution

3 2 9 4 7 3 24 23 22 8 phylogenetic view on RN structure evolution 9 26 6 52 7 5 6 37 57 45 5 84 63 86 77 65 3 74 7 79 8 33 9 97 96 89 47 87 62 32 34 42 73 43 44 4 76 58 75 78 93 39 54 82 99 28 95 52 46

### PHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD

Annu. Rev. Ecol. Syst. 1997. 28:437 66 Copyright c 1997 by Annual Reviews Inc. All rights reserved PHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD John P. Huelsenbeck Department of

### Points of View JACK SULLIVAN 1 AND DAVID L. SWOFFORD 2

Points of View Syst. Biol. 50(5):723 729, 2001 Should We Use Model-Based Methods for Phylogenetic Inference When We Know That Assumptions About Among-Site Rate Variation and Nucleotide Substitution Pattern

### Molecular Evolution and Phylogenetic Tree Reconstruction

1 4 Molecular Evolution and Phylogenetic Tree Reconstruction 3 2 5 1 4 2 3 5 Orthology, Paralogy, Inparalogs, Outparalogs Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length

### Estimating Divergence Dates from Molecular Sequences

Estimating Divergence Dates from Molecular Sequences Andrew Rambaut and Lindell Bromham Department of Zoology, University of Oxford The ability to date the time of divergence between lineages using molecular

### Probabilistic modeling and molecular phylogeny

Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical

### Phylogenetic Inference and Hypothesis Testing. Catherine Lai (92720) BSc(Hons) Department of Mathematics and Statistics University of Melbourne

Phylogenetic Inference and Hypothesis Testing Catherine Lai (92720) BSc(Hons) Department of Mathematics and Statistics University of Melbourne November 13, 2003 Contents 1 Introduction 4 2 Molecular Phylogenetics

### Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution

Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions

### Reading for Lecture 13 Release v10

Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................

### Understanding relationship between homologous sequences

Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

### Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.

Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then

### Are Guinea Pigs Rodents? The Importance of Adequate Models in Molecular Phylogenetics

Journal of Mammalian Evolution, Vol. 4, No. 2, 1997 Are Guinea Pigs Rodents? The Importance of Adequate Models in Molecular Phylogenetics Jack Sullivan1'2 and David L. Swofford1 The monophyly of Rodentia

### Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X