# Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22

Size: px
Start display at page:

Download "Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22"

Transcription

1 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22

2 The Jukes-Cantor model (1969) A u/3 u/3 G u/3 u/3 u/3 C u/3 T the simplest symmetrical model of DNA evolution Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.2/22

3 Transition probabilities under the Jukes-Cantor model All sites change independently All sites have the same stochastic process working at them Make up a fictional kind of event, such that when it happens the site changes to one of the 4 bases chosen at random (equiprobably) Assertion: Having these events occur at rate 4 3u is the same as having the Jukes-Cantor model events occur at rate u The probability of none of these fictional events happens in time t is exp( 4 3 ut) No matter how many of these fictional events occur, provided it is not zero, the chance of ending up at a particular base is 1 4. Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.3/22

4 Jukes-Cantor transition probabilities, cont d Putting all this together, the probability of changing to C, given the site is currently at A, in time t is Prob (C A, t) = 1 4 (1 e 4 3 ut) while Prob (A A, t) = e 4 3 t (1 e 4 3 ut) or Prob (A A, t) = 1 4 (1 + 3e 4 3 ut) so that the total probability of change is (1 e 4 3 ut ) Prob (change t) = 3 4 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.4/22

5 Fraction of sites different, Jukes-Cantor 1 Differences per site Branch length after branches of different length, under the Jukes-Cantor model Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.5/22

6 Kimura s (1980) K2P model of DNA change, A a G b b b b C a T which allows for different rates of transitions and transversions, Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.6/22

7 Motoo Kimura Motoo Kimura, with family in Mishima, Japan in the 1960 s Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.7/22

8 Transition probabilities for the K2P model with two kinds of events: I. At rate α, if the site has a purine (A or G), choose one of the two purines at random and change to it. If the site has a pyrimidine (C or T), choose one of the pyrimidines at random and change to it. II. At rate β, choose one of the 4 bases at random and change to it. By proper choice of α and β one can achieve the overall rate of change and T s /T n ratio R you want. For rate of change 1, the transition probabilities (warning: terminological tangle). ) Prob (transition t) = ( exp R+ 1 2 R+1 t Prob (transversion t) = exp ( 2 R+1 t ) exp ( 2 R+1 t ) (the transversion probability is the sum of the probabilities of both kinds of transversions). Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.8/22

9 Transitions, transversions expected Differences Total differences Transitions 0.20 Transversions Time (branch length) R = 10 in different amounts of branch length under the K2P model, for T s /T n = 10 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.9/22

10 Transitions, transversions expected Total differences Differences Transversions Transitions R = Time (branch length) in different amounts of branch length under the K2P model, for T s /T n = 2 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.10/22

11 Other commonly used models include: Two models that specify the equilibrium base frequencies (you provide the frequencies π A, π C, π G, π T and they are set up to have an equilibrium which achieves them), and also let you control the transition/transversion ratio: The Hasegawa-Kishino-Yano (1985) model: to : A G C T from : A απ G + βπ G απ C απ T G απ A + βπ A απ C απ T C απ A απ G απ T + βπ T T απ A απ G απ C + βπ C Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.11/22

12 My F84 model to : A G C T from : A απ G + β π G πr απ C απ T G απ A + β π A πr απ C απ T C απ A απ G απ T + βπ T π Y T απ A απ G απ C + β π C πy where π R = π A + π G and π Y = π C + π T (The equilibrium frequencies of purines and pyrimidines) Both of these models have formulas for the transition probabilities, and both are subcases of a slightly more general class of models, the Tamura-Nei model (1993). Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.12/22

13 Reversibility P ij Pji π j π i Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.13/22

14 The General Time-Reversible model (GTR) It maintains detailed balance" so that the probability of starting at (say) A and ending at (say) T in evolution is the same as the probability of starting at T and ending at A: to : A G C T from : A απ G βπ C γπ T G απ A δπ C ɛπ T C βπ A δπ G υπ T T γπ A ɛπ G υπ C And there is of course the general 12-parameter model which has arbitrary rates for each of the 12 possible changes (from each of the 4 nucleotides to each of the 3 others). (Neither of these has formulas for the transition probabilities, but those can be done numerically.) Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.14/22

15 Relation between models There are many other models, but these are the most widely-used ones. Here is a general scheme of which models are subcases of which other ones: General 12 parameter model (12) General time reversible model (9) Tamura Nei (6) HKY (5) F84 (5) Kimura K2P (2) Jukes Cantor (1) Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.15/22

16 Rate variation among sites In reality, rates of evolution are not constant among sites. Fortunately, in the transition probability formulas, rates come in as simple multiples of times Thus if we know the rates at two sites, we can compute the probabilities of change by simply, for each site, multiplying all branch lengths by the appropriate rate If we don t know the rates, we can imagine averaging them over a distribution of rates. Usually the Gamma distribution is used In practice a discrete histogram of rates approximates the integration (For the Gamma it seems best to use Generalized Laguerre Quadrature to pick the rates and frequencies in the histogram). Also, there are actually autocorrelations with neighboring sites having similar rates of change. This can be handled by Hidden Markov Models, which we cover later. Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.16/22

17 A pioneer of protein evolution Margaret Dayhoff, about 1966 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.17/22

18 Models of amino acid change in proteins There are a variety of models put forward since the mid-1960 s: 1. Amino acid transition matrices Dayhoff (1968) model. Tabulation of empirical changes in closely related pairs of proteins, normalized. The PAM100 matrix, for example, is the expected transition matrix given 1 substitution per position. Jones, Taylor and Thornton (1992) recalculated PAM matrices (the JTT matrix) from a much larger set of data. Jones, Taylor, and Thurnton (1994a, 1994b) have tabulated a separate mutation data matrix for transmembrane proteins. Koshi and Goldstein (1995) have described the tabulation of further context-dependent mutation data matrices. Henikoff and Henikoff (1992) have tabulated the BLOSUM matrix for conserved motifs in gene families. 2. Goldman and Yang (1994) pioneered codon-based models (see next screen). Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.18/22

19 Approaches to protein sequence models Making a model for protein sequence evolution (a not very practical approach) 1. Use a good model of DNA evolution. 2. Use the appropriate genetic code 3. When an amino acid changes, accept it with a probability that declines as the amino acids become more different 4. Fit this to empirical information on protein evolution 5. Take into account variation of rate from site to site 6. Take into account correlation of rate variation in adjacent sites 7. How about protein structure? Secondary structure? 3 D struncture? Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.19/22

20 References Barry, D., and J. A. Hartigan Statistical analysis of hominoid molecular evolution. Statistical Science 2: [Early use of full 12-parameter model] Dayhoff, M. O. and R. V. Eck Atlas of Protein Sequence and Structure National Biomedical Research Foundation, Silver Spring, Maryland. [Dayhoff s PAM modelfor proteins] Goldman, N., and Z. Yang A codon-based model of nucleotide substitution for protein-coding DNA sequences. Molecular Biology and Evolution 11: [codon-based protein/dna models] Hasegawa, M., H. Kishino, and T. Yano Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 22: [HKY model] Henikoff, S. and J. G. Henikoff Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences, USA 89: [BLOSUM protein model] Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.20/22

21 References Jones, D. T., W. R. Taylor, and J. M. Thornton The rapid generation of mutation data matrices from protein sequences. Computer Applcations in the Biosciences (CABIOS) 8: [JTT model for proteins] Jones, D. T., W. R. Taylor, and J. M. Thornton. 1994a. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry 33: JTT membrane protein model] Jones, D. T., W. R. Taylor, and J. M. Thornton. 1994b. A mutation data matrix for transmembrane proteins. FEBS Letters 339: [JTT membrane protein model] Jukes, T. H. and C. Cantor Evolution of protein molecules. pp in Mammalian Protein Metabolism, ed. M. N. Munro. Academic Press, New York. [Jukes-Cantor model] Kimura, M A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16: [Kimura s 2-parameter model] Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.21/22

22 References Koshi, J. M. and R. A. Goldstein Context-dependent optimal substitution matrices. Protein Engineering 8: [generating other kinds of protein model matrices] Lanave, C., G. Preparata, C. Saccone, and G. Serio A new method for calculating evolutionary substitution rates. Journal of Molecular Evolution 20: [General reversible model] Lockhart, P. J., M. A. Steel, M. D. Hendy, and D. Penny Recovering evolutionary trees under a more realistic model of sequence evolution. Molecular Biology and Evolution 11: [The LogDet distance for correcting for changing base composition] Tamura, K. and M. Nei Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution 10: [Tamura-Nei model] Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.22/22

23 How it was done This projection produced using the prosper style in LaTeX, using Latex to make a.dvi file, using dvips to turn this into a Postscript file, using ps2pdf to mill it into a PDF file, and displaying the slides in Adobe Acrobat Reader. Result: nice slides using freeware. Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.23/22

### Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26

Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and

### Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36

### Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/39

### How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?

How should we go about modeling this? gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time Substitution rate Can we observe time or subst. rate? What

### Lecture 27. Phylogeny methods, part 7 (Bootstraps, etc.) p.1/30

Lecture 27. Phylogeny methods, part 7 (Bootstraps, etc.) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 7 (Bootstraps, etc.) p.1/30 A non-phylogeny

### Lecture Notes: Markov chains

Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity

### Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018

Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras

### Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

### Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).

1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff

### Week 5: Distance methods, DNA and protein models

Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03

### Taming the Beast Workshop

Workshop David Rasmussen & arsten Magnus June 27, 2016 1 / 31 Outline of sequence evolution: rate matrices Markov chain model Variable rates amongst different sites: +Γ Implementation in BES2 2 / 31 genotype

### What Is Conservation?

What Is Conservation? Lee A. Newberg February 22, 2005 A Central Dogma Junk DNA mutates at a background rate, but functional DNA exhibits conservation. Today s Question What is this conservation? Lee A.

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics June 1, 2009 Smithsonian Workshop on Molecular Evolution Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs, CT Copyright 2009

### Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

### Substitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A

GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

### Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood For: Prof. Partensky Group: Jimin zhu Rama Sharma Sravanthi Polsani Xin Gong Shlomit klopman April. 7. 2003 Table of Contents Introduction...3

### Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of

### Maximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.

Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters - "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf

### Predicting the Evolution of two Genes in the Yeast Saccharomyces Cerevisiae

Available online at wwwsciencedirectcom Procedia Computer Science 11 (01 ) 4 16 Proceedings of the 3rd International Conference on Computational Systems-Biology and Bioinformatics (CSBio 01) Predicting

### Inferring Phylogenies from Protein Sequences by. Parsimony, Distance, and Likelihood Methods. Joseph Felsenstein. Department of Genetics

Inferring Phylogenies from Protein Sequences by Parsimony, Distance, and Likelihood Methods Joseph Felsenstein Department of Genetics University of Washington Box 357360 Seattle, Washington 98195-7360

### Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

### Letter to the Editor. Department of Biology, Arizona State University

Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona

### Mutation models I: basic nucleotide sequence mutation models

Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 26 January 2011 Workshop on Molecular Evolution Český Krumlov, Česká republika Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut,

### Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

### In: M. Salemi and A.-M. Vandamme (eds.). To appear. The. Phylogenetic Handbook. Cambridge University Press, UK.

In: M. Salemi and A.-M. Vandamme (eds.). To appear. The Phylogenetic Handbook. Cambridge University Press, UK. Chapter 4. Nucleotide Substitution Models THEORY Korbinian Strimmer () and Arndt von Haeseler

### In: P. Lemey, M. Salemi and A.-M. Vandamme (eds.). To appear in: The. Chapter 4. Nucleotide Substitution Models

In: P. Lemey, M. Salemi and A.-M. Vandamme (eds.). To appear in: The Phylogenetic Handbook. 2 nd Edition. Cambridge University Press, UK. (final version 21. 9. 2006) Chapter 4. Nucleotide Substitution

### RELATING PHYSICOCHEMMICAL PROPERTIES OF AMINO ACIDS TO VARIABLE NUCLEOTIDE SUBSTITUTION PATTERNS AMONG SITES ZIHENG YANG

RELATING PHYSICOCHEMMICAL PROPERTIES OF AMINO ACIDS TO VARIABLE NUCLEOTIDE SUBSTITUTION PATTERNS AMONG SITES ZIHENG YANG Department of Biology (Galton Laboratory), University College London, 4 Stephenson

### Evolutionary Analysis of Viral Genomes

University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral

### Phylogenetics. BIOL 7711 Computational Bioscience

Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium

### Phylogenetic Inference and Hypothesis Testing. Catherine Lai (92720) BSc(Hons) Department of Mathematics and Statistics University of Melbourne

Phylogenetic Inference and Hypothesis Testing Catherine Lai (92720) BSc(Hons) Department of Mathematics and Statistics University of Melbourne November 13, 2003 Contents 1 Introduction 4 2 Molecular Phylogenetics

### Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)

Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline

### Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

### Summary statistics, distributions of sums and means

Summary statistics, distributions of sums and means Joe Felsenstein Summary statistics, distributions of sums and means p.1/17 Quantiles In both empirical distributions and in the underlying distribution,

### "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

### BMI/CS 776 Lecture 4. Colin Dewey

BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 29 July 2014 Workshop on Molecular Evolution Woods Hole, Massachusetts Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2014 Woods Hole Workshop

### KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging

Method KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging Zhang Zhang 1,2,3#, Jun Li 2#, Xiao-Qian Zhao 2,3, Jun Wang 1,2,4, Gane Ka-Shu Wong 2,4,5, and Jun Yu 1,2,4 * 1

### Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.

Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then

### EVOLUTIONARY DISTANCE MODEL BASED ON DIFFERENTIAL EQUATION AND MARKOV PROCESS

August 0 Vol 4 No 005-0 JATIT & LLS All rights reserved ISSN: 99-8645 wwwjatitorg E-ISSN: 87-95 EVOLUTIONAY DISTANCE MODEL BASED ON DIFFEENTIAL EUATION AND MAKOV OCESS XIAOFENG WANG College of Mathematical

### Inference of phylogenies, with some thoughts on statistics and geometry p.1/31

Inference of phylogenies, with some thoughts on statistics and geometry Joe Felsenstein University of Washington Inference of phylogenies, with some thoughts on statistics and geometry p.1/31 Darwin s

### Lab 9: Maximum Likelihood and Modeltest

Integrative Biology 200A University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS" Spring 2010 Updated by Nick Matzke Lab 9: Maximum Likelihood and Modeltest In this lab we re going to use PAUP*

### 7.36/7.91 recitation CB Lecture #4

7.36/7.91 recitation 2-19-2014 CB Lecture #4 1 Announcements / Reminders Homework: - PS#1 due Feb. 20th at noon. - Late policy: ½ credit if received within 24 hrs of due date, otherwise no credit - Answer

### MODELING EVOLUTION AT THE PROTEIN LEVEL USING AN ADJUSTABLE AMINO ACID FITNESS MODEL

MODELING EVOLUTION AT THE PROTEIN LEVEL USING AN ADJUSTABLE AMINO ACID FITNESS MODEL MATTHEW W. DIMMIC*, DAVID P. MINDELL RICHARD A. GOLDSTEIN* * Biophysics Research Division Department of Biology and

### Efficiencies of maximum likelihood methods of phylogenetic inferences when different substitution models are used

Molecular Phylogenetics and Evolution 31 (2004) 865 873 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev Efficiencies of maximum likelihood methods of phylogenetic inferences when different

### Dr. Amira A. AL-Hosary

Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

### Quantifying sequence similarity

Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

### How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building

How Molecules Evolve Guest Lecture: Principles and Methods of Systematic Biology 11 November 2013 Chris Simon Approaching phylogenetics from the point of view of the data Understanding how sequences evolve

### Phylogenetics: Building Phylogenetic Trees

1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

### Edward Susko Department of Mathematics and Statistics, Dalhousie University. Introduction. Installation

1 dist est: Estimation of Rates-Across-Sites Distributions in Phylogenetic Subsititution Models Version 1.0 Edward Susko Department of Mathematics and Statistics, Dalhousie University Introduction The

Preliminaries Download PAUP* from: http://people.sc.fsu.edu/~dswofford/paup_test 1 A model of the Boston T System 1 Idea from Paul Lewis A simpler model? 2 Why do models matter? Model-based methods including

### Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 23 July 2013 Workshop on Molecular Evolution Woods Hole, Massachusetts Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs,

### Lie Markov models. Jeremy Sumner. School of Physical Sciences University of Tasmania, Australia

Lie Markov models Jeremy Sumner School of Physical Sciences University of Tasmania, Australia Stochastic Modelling Meets Phylogenetics, UTAS, November 2015 Jeremy Sumner Lie Markov models 1 / 23 The theory

### The Phylo- HMM approach to problems in comparative genomics, with examples.

The Phylo- HMM approach to problems in comparative genomics, with examples. Keith Bettinger Introduction The theory of evolution explains the diversity of organisms on Earth by positing that earlier species

### Inferring Molecular Phylogeny

Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### Week 7: Bayesian inference, Testing trees, Bootstraps

Week 7: ayesian inference, Testing trees, ootstraps Genome 570 May, 2008 Week 7: ayesian inference, Testing trees, ootstraps p.1/54 ayes Theorem onditional probability of hypothesis given data is: Prob

### Counting phylogenetic invariants in some simple cases. Joseph Felsenstein. Department of Genetics SK-50. University of Washington

Counting phylogenetic invariants in some simple cases Joseph Felsenstein Department of Genetics SK-50 University of Washington Seattle, Washington 98195 Running Headline: Counting Phylogenetic Invariants

### Likelihood in Phylogenetics

Likelihood in Phylogenetics 22 July 2017 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2017 Woods Hole Workshop in Molecular

### 7. Tests for selection

Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www. nowicklab.info

### Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution

Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions

### Molecular Evolution, course # Final Exam, May 3, 2006

Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least

### Introduction to MEGA

Introduction to MEGA Download at: http://www.megasoftware.net/index.html Thomas Randall, PhD tarandal@email.unc.edu Manual at: www.megasoftware.net/mega4 Use of phylogenetic analysis software tools Bioinformatics

### THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

### Understanding relationship between homologous sequences

Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

### Modeling Noise in Genetic Sequences

Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:

### Reconstruire le passé biologique modèles, méthodes, performances, limites

Reconstruire le passé biologique modèles, méthodes, performances, limites Olivier Gascuel Centre de Bioinformatique, Biostatistique et Biologie Intégrative C3BI USR 3756 Institut Pasteur & CNRS Reconstruire

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 28 January 2015 Paul O. Lewis Department of Ecology & Evolutionary Biology Workshop on Molecular Evolution Český Krumlov Paul O. Lewis (2015 Czech Republic Workshop

### Estimating Divergence Dates from Molecular Sequences

Estimating Divergence Dates from Molecular Sequences Andrew Rambaut and Lindell Bromham Department of Zoology, University of Oxford The ability to date the time of divergence between lineages using molecular

### Bayesian Analysis of Elapsed Times in Continuous-Time Markov Chains

Bayesian Analysis of Elapsed Times in Continuous-Time Markov Chains Marco A. R. Ferreira 1, Marc A. Suchard 2,3,4 1 Department of Statistics, University of Missouri at Columbia, USA 2 Department of Biomathematics

### POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

### Phylogenetic Assumptions

Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 21 July 2015 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2015 Woods Hole Workshop in

### Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics 21 July 2015 Workshop on Molecular Evolution Woods Hole, Mass. Paul O. Lewis Department of Ecology & Evolutionary Biology Paul O. Lewis (2015 Woods Hole Workshop in

### Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony?no model, simply count minimum number

### Models of Molecular Evolution and Phylogeny

REVIEW Models of Molecular Evolution and Phylogeny Pietro Liò and Nick Goldman 1 Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK Phylogenetic reconstruction is a fast-growing field

### The Importance of Proper Model Assumption in Bayesian Phylogenetics

Syst. Biol. 53(2):265 277, 2004 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150490423520 The Importance of Proper Model Assumption in Bayesian

### Scoring Matrices. Shifra Ben-Dor Irit Orr

Scoring Matrices Shifra Ben-Dor Irit Orr Scoring matrices Sequence alignment and database searching programs compare sequences to each other as a series of characters. All algorithms (programs) for comparison

### Special features of phangorn (Version 2.3.1)

Special features of phangorn (Version 2.3.1) Klaus P. Schliep November 1, 2017 Introduction This document illustrates some of the phangorn [4] specialised features which are useful but maybe not as well-known

### Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences

Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 1 Learning Objectives

### Tutorial on Theoretical Population Genetics

Tutorial on Theoretical Population Genetics Joe Felsenstein Department of Genome Sciences and Department of Biology University of Washington, Seattle Tutorial on Theoretical Population Genetics p.1/40

### Probabilistic modeling and molecular phylogeny

Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical

### EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

### Natural selection on the molecular level

Natural selection on the molecular level Fundamentals of molecular evolution How DNA and protein sequences evolve? Genetic variability in evolution } Mutations } forming novel alleles } Inversions } change

### Molecular Evolution and Comparative Genomics

Molecular Evolution and Comparative Genomics --- the phylogenetic HMM model 10-810, CMB lecture 5---Eric Xing Some important dates in history (billions of years ago) Origin of the universe 15 ±4 Formation

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I)

CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I) Contents Alignment algorithms Needleman-Wunsch (global alignment) Smith-Waterman (local alignment) Heuristic algorithms FASTA BLAST

### Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution

Today s topics Inferring phylogeny Introduction! Distance methods! Parsimony method!"#\$%&'(!)* +,-.'/01!23454(6!7!2845*0&4'9#6!:&454(6 ;?@AB=C?DEF Overview of phylogenetic inferences Methodology Methods

### MANNINO, FRANK VINCENT. Site-to-Site Rate Variation in Protein Coding

Abstract MANNINO, FRANK VINCENT. Site-to-Site Rate Variation in Protein Coding Genes. (Under the direction of Spencer V. Muse) The ability to realistically model gene evolution improved dramatically with

### Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

### Sequence Alignment: A General Overview. COMP Fall 2010 Luay Nakhleh, Rice University

Sequence Alignment: A General Overview COMP 571 - Fall 2010 Luay Nakhleh, Rice University Life through Evolution All living organisms are related to each other through evolution This means: any pair of

### Bioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre

Bioinformatics Scoring Matrices David Gilbert Bioinformatics Research Centre www.brc.dcs.gla.ac.uk Department of Computing Science, University of Glasgow Learning Objectives To explain the requirement

### Sequence Alignment: Scoring Schemes. COMP 571 Luay Nakhleh, Rice University

Sequence Alignment: Scoring Schemes COMP 571 Luay Nakhleh, Rice University Scoring Schemes Recall that an alignment score is aimed at providing a scale to measure the degree of similarity (or difference)

### Efficiencies of Different Genes and Different Tree-building in Recovering a Known Vertebrate Phylogeny

Efficiencies of Different Genes and Different Tree-building in Recovering a Known Vertebrate Phylogeny Methods Claudia A. M. RUSSO,~ Naoko Takezaki, and Masatoshi Institute of Molecular Evolutionary Genetics

### Kei Takahashi and Masatoshi Nei

Efficiencies of Fast Algorithms of Phylogenetic Inference Under the Criteria of Maximum Parsimony, Minimum Evolution, and Maximum Likelihood When a Large Number of Sequences Are Used Kei Takahashi and