# Molecular Evolution and Phylogenetic Tree Reconstruction

Size: px
Start display at page:

Transcription

1 1 4 Molecular Evolution and Phylogenetic Tree Reconstruction

2 Orthology, Paralogy, Inparalogs, Outparalogs

3 Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length represents evolution time AKA genetic distance Not necessarily chronological time

4 Inferring Phylogenetic Trees Trees can be inferred by several criteria: Morphology of the organisms Can lead to mistakes Sequence comparison Example: Mouse: ACAGTGACGCCCCAAACGT Rat: ACAGTGACGCTACAAACGT Baboon: CCTGTGACGTAACAAACGA Chimp: CCTGTGACGTAGCAAACGA Human: CCTGTGACGTAGCAAACGA

5 Distance Between Two Sequences Basic principle: Distance proportional to degree of independent sequence evolution Given sequences x i, x j, d ij = distance between the two sequences One possible definition: d ij = fraction f of sites u where x i [u] x j [u] Better scores are derived by modeling evolution as a continuous change process

6 Molecular Evolution Modeling sequence substitution: Consider what happens at a position for time Δt, P(t) = vector of probabilities of {A,C,G,T} at time t µ AC = rate of transition from A to C per unit time µ A = µ AC + µ AG + µ AT rate of transition out of A p A (t+δt) = p A (t) p A (t) µ A Δt + p C (t) µ CA Δt + p G (t) µ GA Δt + p T (t) µ TA Δt

7 Molecular Evolution In matrix/vector notation, we get P(t+Δt) = P(t) + Q P(t) Δt where Q is the substitution rate matrix

8 Molecular Evolution This is a differential equation: P (t) = Q P(t) Q => prob. distribution over {A,C,G,T} at each position, stationary (equilibrium) frequencies π A, π C, π G, π T Each Q is an evolutionary model Some work better than others

9 Evolutionary Models Jukes-Cantor Kimura Felsenstein HKY

10 Estimating Distances Solve the differential equation and compute expected evolutionary time given sequences P (t) = Q P(t) Jukes-Cantor: Let P AA (t) = P CC (t) = P CC (t) = P CC (t) = r P AC (t) = = P TG (t) = s Then, r (t) = - ¾ r(t) µ + ¾ s(t) µ s (t) = - ¼ s(t) µ + ¼ r(t) µ Which is satisfied by r(t) = ¼ (1 + 3e -µt ) s(t) = ¼ (1 - e -µt )

11 Estimating Distances Solve the differential equation and compute expected evolutionary time given sequences P (t) = Q P(t) Jukes-Cantor:

12 Estimating Distances Let p = probability a base is different between two sequences, Solve to find t Jukes-Cantor r(t) = 1 p = ¼ (1 + 3e -µt ) p = ¾ ¾ e -µt Therefore, ¾ p = ¾ e -µt 1 4p/3 = e -µt µt = - ln(1 4p/3) Letting d = ¾ µt, denoting substitutions per site,

13 Estimating Distances d: Branch length in terms of substitutions per site Jukes-Cantor Kimura

14 Simple method for building tree: UPGMA UPGMA (unweighted pair group method using arithmetic averages) Or the Average Linkage Method Given two disjoint clusters C i, C j of sequences, 1 d ij = Σ {p Ci, q Cj} d pq C i C j Claim that if C k = C i C j, then distance to another cluster C l is: d il C i + d jl C j d kl = C i + C j

15 Algorithm: Average Linkage Initialization: Assign each x i into its own cluster C i Define one leaf per sequence, height Iteration: Find two clusters C i, C j s.t. d ij is min Let C k = C i C j Define node connecting C i, C j, and place it at height d ij /2 Delete C i, C j Termination: When two clusters i, j remain, place root at height d ij /

16 Average Linkage Example v w x y z v w x y 0 2 z 0 v w x yz v w xyz v w 0 8 xyz 0 vw xyz vw 0 8 xyz 0 4 v w x 0 4 yz 0 3 v w x y 2 z 1

17 Ultrametric Distances and Molecular Clock Definition: A distance function d(.,.) is ultrametric if for any three distances d ij d ik d ij, it is true that The Molecular Clock: d ij d ik = d jk The evolutionary distance between species x and y is 2 the Earth time to reach the nearest common ancestor That is, the molecular clock has constant rate in all species years The molecular clock results in ultrametric distances

18 Ultrametric Distances & Average Linkage Average Linkage is guaranteed to reconstruct correctly a binary tree with ultrametric distances Proof: Exercise

19 Weakness of Average Linkage Molecular clock: all species evolve at the same rate (Earth time) However, certain species (e.g., mouse, rat) evolve much faster Example where UPGMA messes up: Correct tree AL tree

20 Additive Distances d 1, Given a tree, a distance measure is additive if the distance between any pair of leaves is the sum of lengths of edges connecting them Given a tree T & additive distances d ij, can uniquely reconstruct edge lengths: Find two neighboring leaves i, j, with common parent k Place parent node k at distance d km = ½ (d im + d jm d ij ) from any node m i, j

21 Additive Distances x z y w For any four leaves x, y, z, w, consider the three sums d(x, y) + d(z, w) d(x, z) + d(y, w) d(x, w) + d(y, z) One of them is smaller than the other two, which are equal d(x, y) + d(z, w) < d(x, z) + d(y, w) = d(x, w) + d(y, z)

22 Reconstructing Additive Distances Given T x D y 4 5 T v w x y z v z w w x y 0 14 z 0 If we know T and D, but do not know the length of each leaf, we can reconstruct those lengths 6 v

23 Reconstructing Additive Distances Given T x D y T v w x y z v z w w x v y 0 14 z 0

24 Reconstructing Additive Distances Given T D v w x y z v w x y 0 14 z 0 x y z a T w D 1 a x y z a x y 0 14 z 0 d ax = ½ (d vx + d wx d vw ) d ay = ½ (d vy + d wy d vw ) d az = ½ (d vz + d wz d vw ) v

25 Reconstructing Additive Distances Given T D 1 x a x y z a y 5 x b y z z c a 4 T w D 2 a b z a b 0 10 z 0 D 3 a c a 0 3 c 0 6 d(a, c) = 3 d(b, c) = d(a, b) d(a, c) = 3 d(c, z) = d(a, z) d(a, c) = 7 d(b, x) = d(a, x) d(a, b) = 5 d(b, y) = d(a, y) d(a, b) = 4 d(a, w) = d(z, w) d(a, z) = 4 d(a, v) = d(z, v) d(a, z) = 6 Correct!!! v

26 Neighbor-Joining Guaranteed to produce the correct tree if distance is additive May produce a good tree even when distance is not additive 1 3 Step 1: Finding neighboring leaves Define D ij = (N 2) d ij k i d ik k j d jk Claim: The above magic trick ensures that i, j are neighbors if D ij is minimal

27 Neighbor-Joining D ij = (N 2) d ij k i d ik k j d jk

28 Neighbor-Joining D ij = (N 2) d ij k i d ik k j d jk All leaf edges appear negatively exactly twice All other edges appear negatively once for every path from each of the two leaves i, j, to leaves k i, j

29 Some Trees

30 Some Trees

31 Some Trees

32 Some Trees

33 Some Trees

34 Some Trees

### CSCI1950 Z Computa4onal Methods for Biology Lecture 4. Ben Raphael February 2, hhp://cs.brown.edu/courses/csci1950 z/ Algorithm Summary

CSCI1950 Z Computa4onal Methods for Biology Lecture 4 Ben Raphael February 2, 2009 hhp://cs.brown.edu/courses/csci1950 z/ Algorithm Summary Parsimony Probabilis4c Method Input Output Sankoff s & Fitch

### Phylogenetic trees 07/10/13

Phylogenetic trees 07/10/13 A tree is the only figure to occur in On the Origin of Species by Charles Darwin. It is a graphical representation of the evolutionary relationships among entities that share

### Phylogeny: traditional and Bayesian approaches

Phylogeny: traditional and Bayesian approaches 5-Feb-2014 DEKM book Notes from Dr. B. John Holder and Lewis, Nature Reviews Genetics 4, 275-284, 2003 1 Phylogeny A graph depicting the ancestor-descendent

### EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

Tree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny

### Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.

Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then

### Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

### CSCI1950 Z Computa4onal Methods for Biology Lecture 5

CSCI1950 Z Computa4onal Methods for Biology Lecture 5 Ben Raphael February 6, 2009 hip://cs.brown.edu/courses/csci1950 z/ Alignment vs. Distance Matrix Mouse: ACAGTGACGCCACACACGT Gorilla: CCTGCGACGTAACAAACGC

### Phylogeny: building the tree of life

Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan

### BINF6201/8201. Molecular phylogenetic methods

BINF60/80 Molecular phylogenetic methods 0-7-06 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics

### Evolutionary Tree Analysis. Overview

CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### Phylogenetic Tree Reconstruction

I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

### Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies

Inferring Phylogenetic Trees Distance Approaches Representing distances in rooted and unrooted trees The distance approach to phylogenies given: an n n matrix M where M ij is the distance between taxa

### Theory of Evolution Charles Darwin

Theory of Evolution Charles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (83-36) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

### Math 239: Discrete Mathematics for the Life Sciences Spring Lecture 14 March 11. Scribe/ Editor: Maria Angelica Cueto/ C.E.

Math 239: Discrete Mathematics for the Life Sciences Spring 2008 Lecture 14 March 11 Lecturer: Lior Pachter Scribe/ Editor: Maria Angelica Cueto/ C.E. Csar 14.1 Introduction The goal of today s lecture

### CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003

CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1 Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003 Lecturer: Wing-Kin Sung Scribe: Ning K., Shan T., Xiang

### POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

### Algorithms in Bioinformatics

Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

### Reconstruction of certain phylogenetic networks from their tree-average distances

Reconstruction of certain phylogenetic networks from their tree-average distances Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu October 10,

### Phylogenetic inference

Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

### Lecture 18 Molecular Evolution and Phylogenetics

6.047/6.878 - Computational Biology: Genomes, Networks, Evolution Lecture 18 Molecular Evolution and Phylogenetics Patrick Winston s 6.034 Somewhere, something went wrong 1 Challenges in Computational

### Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony

### Substitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A

GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC

### Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Phylogeny and Evolution Gina Cannarozzi ETH Zurich Institute of Computational Science History Aristotle (384-322 BC) classified animals. He found that dolphins do not belong to the fish but to the mammals.

### "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

### Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

### Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

### Concepts and Methods in Molecular Divergence Time Estimation

Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks

### Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

### Phylogenetics: Building Phylogenetic Trees

1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

### 9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

### Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

### Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

### Hierarchical Clustering

Hierarchical Clustering Some slides by Serafim Batzoglou 1 From expression profiles to distances From the Raw Data matrix we compute the similarity matrix S. S ij reflects the similarity of the expression

### Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony?no model, simply count minimum number

### Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

### Dr. Amira A. AL-Hosary

Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

### Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny

### Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).

1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff

### Phylogenetics. BIOL 7711 Computational Bioscience

Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium

### Week 5: Distance methods, DNA and protein models

Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03

### Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution

Today s topics Inferring phylogeny Introduction! Distance methods! Parsimony method!"#\$%&'(!)* +,-.'/01!23454(6!7!2845*0&4'9#6!:&454(6 ;?@AB=C?DEF Overview of phylogenetic inferences Methodology Methods

### Using algebraic geometry for phylogenetic reconstruction

Using algebraic geometry for phylogenetic reconstruction Marta Casanellas i Rius (joint work with Jesús Fernández-Sánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya IMA

### Lecture 10: Phylogeny

Computational Genomics Prof. Ron Shamir & Prof. Roded Sharan School of Computer Science, Tel Aviv University גנומיקה חישובית פרופ' רון שמיר ופרופ' רודד שרן ביה"ס למדעי המחשב,אוניברסיטת תל אביב Lecture

### Phylogeny Tree Algorithms

Phylogeny Tree lgorithms Jianlin heng, PhD School of Electrical Engineering and omputer Science University of entral Florida 2006 Free for academic use. opyright @ Jianlin heng & original sources for some

### Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

ioinformatics -- lecture 9 Phylogenetic trees istance-based tree building Parsimony (,(,(,))) rees can be represented in "parenthesis notation". Each set of parentheses represents a branch-point (bifurcation),

### Tree-average distances on certain phylogenetic networks have their weights uniquely determined

Tree-average distances on certain phylogenetic networks have their weights uniquely determined Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu

### Multiple Sequence Alignment. Sequences

Multiple Sequence Alignment Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe

### Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

### Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences

Mathematical Statistics Stockholm University Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences Bodil Svennblad Tom Britton Research Report 2007:2 ISSN 650-0377 Postal

### Phylogeny Jan 5, 2016

גנומיקה חישובית Computational Genomics Phylogeny Jan 5, 2016 Slides: Adi Akavia Nir Friedman s slides at HUJI (based on ALGMB 98) Anders Gorm Pedersen,Technical University of Denmark Sources: Joe Felsenstein

### Reconstruire le passé biologique modèles, méthodes, performances, limites

Reconstruire le passé biologique modèles, méthodes, performances, limites Olivier Gascuel Centre de Bioinformatique, Biostatistique et Biologie Intégrative C3BI USR 3756 Institut Pasteur & CNRS Reconstruire

### What is Phylogenetics

What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

### Theory of Evolution. Charles Darwin

Theory of Evolution harles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (8-6) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

### Reading for Lecture 13 Release v10

Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................

### Constructing Evolutionary Trees

Constructing Evolutionary Trees 0-0 HIV Evolutionary Tree SIVs (monkeys)! HIV (human)! human infection! human HIV/M human HIV/M chimpanzee SIV chimpanzee SIV human HIV/N human HIV/N chimpanzee SIV chimpanzee

### THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

### Modeling Noise in Genetic Sequences

Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:

### Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

### C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

### Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

### RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION

RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION MAGNUS BORDEWICH, KATHARINA T. HUBER, VINCENT MOULTON, AND CHARLES SEMPLE Abstract. Phylogenetic networks are a type of leaf-labelled,

### Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

### A few logs suce to build (almost) all trees: Part II

Theoretical Computer Science 221 (1999) 77 118 www.elsevier.com/locate/tcs A few logs suce to build (almost) all trees: Part II Peter L. Erdős a;, Michael A. Steel b,laszlo A.Szekely c, Tandy J. Warnow

### Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018

Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras

### Phylogenetic Assumptions

Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by

### Minimum evolution using ordinary least-squares is less robust than neighbor-joining

Minimum evolution using ordinary least-squares is less robust than neighbor-joining Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA email: swillson@iastate.edu November

### Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations

Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations Masatoshi Nei and Li Jin Center for Demographic and Population Genetics, Graduate School of Biomedical Sciences,

### Introduction to Bioinformatics Introduction to Bioinformatics

Dr. rer. nat. Gong Jing Cancer Research Center Medicine School of Shandong University 2012.11.09 1 Chapter 4 Phylogenetic Tree 2 Phylogeny Evidence from morphological ( 形态学的 ), biochemical, and gene sequence

### Identity. "At least one dog has fleas" is translated by an existential quantifier"

Identity Quantifiers are so-called because they say how many. So far, we've only used the quantifiers to give the crudest possible answers to the question "How many dogs have fleas?": "All," "None," "Some,"

### Consistency Index (CI)

Consistency Index (CI) minimum number of changes divided by the number required on the tree. CI=1 if there is no homoplasy negatively correlated with the number of species sampled Retention Index (RI)

### Evolutionary Models. Evolutionary Models

Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment

### Understanding relationship between homologous sequences

Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

### A (short) introduction to phylogenetics

A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field

### Maximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.

Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters - "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf

### What Is Conservation?

What Is Conservation? Lee A. Newberg February 22, 2005 A Central Dogma Junk DNA mutates at a background rate, but functional DNA exhibits conservation. Today s Question What is this conservation? Lee A.

### Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço jcarrico@fm.ul.pt Charles Darwin (1809-1882) Charles Darwin s tree of life in Notebook B, 1837-1838 Ernst Haeckel (1934-1919)

### 3D Elasticity Theory

3D lasticity Theory Many structural analysis problems are analysed using the theory of elasticity in which Hooke s law is used to enforce proportionality between stress and strain at any deformation level.

### p(-,i)+p(,i)+p(-,v)+p(i,v),v)+p(i,v)

Multile Sequence Alignment Given: Set of sequences Score matrix Ga enalties Find: Alignment of sequences such that otimal score is achieved. Motivation Aligning rotein families Establish evolutionary relationshis

### Neighbor Joining Algorithms for Inferring Phylogenies via LCA-Distances

Neighbor Joining Algorithms for Inferring Phylogenies via LCA-Distances Ilan Gronau Shlomo Moran September 6, 2006 Abstract Reconstructing phylogenetic trees efficiently and accurately from distance estimates

### Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of

### Let S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely

JOURNAL OF COMPUTATIONAL BIOLOGY Volume 8, Number 1, 2001 Mary Ann Liebert, Inc. Pp. 69 78 Perfect Phylogenetic Networks with Recombination LUSHENG WANG, 1 KAIZHONG ZHANG, 2 and LOUXIN ZHANG 3 ABSTRACT

### Chapter 3: Phylogenetics

Chapter 3: Phylogenetics 3. Computing Phylogeny Prof. Yechiam Yemini (YY) Computer Science epartment Columbia niversity Overview Computing trees istance-based techniques Maximal Parsimony (MP) techniques

### BMI/CS 776 Lecture 4. Colin Dewey

BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2

### Molecular Phylogenetics (Hannes Luz)

ontents: Molecular Phylogenetics (Hannes Luz) Phylogenetic Trees, basic notions A character based method: Maximum Parsimony Trees from distances Markov Models of Sequence Evolution, Maximum Likelihood

### Molecular Evolution, course # Final Exam, May 3, 2006

Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least

### Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

### (Stevens 1991) 1. morphological characters should be assumed to be quantitative unless demonstrated otherwise

Bot 421/521 PHYLOGENETIC ANALYSIS I. Origins A. Hennig 1950 (German edition) Phylogenetic Systematics 1966 B. Zimmerman (Germany, 1930 s) C. Wagner (Michigan, 1920-2000) II. Characters and character states

### Evolutionary Analysis of Viral Genomes

University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral

### 66 Bioinformatics I, WS 09-10, D. Huson, December 1, Evolutionary tree of organisms, Ernst Haeckel, 1866

66 Bioinformatics I, WS 09-10, D. Huson, December 1, 2009 5 Phylogeny Evolutionary tree of organisms, Ernst Haeckel, 1866 5.1 References J. Felsenstein, Inferring Phylogenies, Sinauer, 2004. C. Semple

### Supplementary Information

Supplementary Information For the article"comparable system-level organization of Archaea and ukaryotes" by J. Podani, Z. N. Oltvai, H. Jeong, B. Tombor, A.-L. Barabási, and. Szathmáry (reference numbers

### Phylogenetic inference: from sequences to trees

W ESTFÄLISCHE W ESTFÄLISCHE W ILHELMS -U NIVERSITÄT NIVERSITÄT WILHELMS-U ÜNSTER MM ÜNSTER VOLUTIONARY FUNCTIONAL UNCTIONAL GENOMICS ENOMICS EVOLUTIONARY Bioinformatics 1 Phylogenetic inference: from sequences

### InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

### The Generalized Neighbor Joining method

The Generalized Neighbor Joining method Ruriko Yoshida Dept. of Mathematics Duke University Joint work with Dan Levy and Lior Pachter www.math.duke.edu/ ruriko data mining 1 Challenge We would like to

### Sequence Analysis '17- lecture 8. Multiple sequence alignment

Sequence Analysis '17- lecture 8 Multiple sequence alignment Ex5 explanation How many random database search scores have e-values 10? (Answer: 10!) Why? e-value of x = m*p(s x), where m is the database

### Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Evolutionary trees Bonobo Chimpanzee Human Neanderthal Gorilla Orangutan Describe the relationship between objects, e.g. species or genes Early evolutionary studies The evolutionary relationships between