Evolutionary trees. Describe the relationship between objects, e.g. species or genes

[Figure: evolutionary tree with leaves Bonobo, Chimpanzee, Human, Neanderthal, Gorilla and Orangutan]

Early evolutionary studies
The evolutionary relationships between species are referred to as their phylogeny (often depicted as a phylogenetic or evolutionary tree).
From Darwin until the early 1960s, anatomical features were the dominant criteria used to derive evolutionary relationships between species.
The evolutionary relationships derived from these relatively subjective observations were often inconclusive, and some of them were later proved incorrect.

The Giant Panda riddle
For roughly 100 years scientists were unable to figure out which family the giant panda belongs to.
Giant pandas look like bears but have features that are unusual for bears and typical for raccoons, e.g., they do not hibernate.
In 1985, Stephen O'Brien and colleagues solved the giant panda classification problem using DNA sequences and algorithms.

Evolutionary trees: DNA-based approach
50 years ago, Emile Zuckerkandl and Linus Pauling brought the reconstruction of evolutionary relationships with DNA into the spotlight.
In the first few years after Zuckerkandl and Pauling proposed using DNA for evolutionary studies, the possibility of reconstructing evolutionary trees by DNA analysis was hotly debated.
Now it is the dominant approach to studying evolution.

"From the point of view of hemoglobin structure, it appears that gorilla is just an abnormal human, or man an abnormal gorilla, and the two species form actually one continuous population."
Emile Zuckerkandl, Classification and Human Evolution, 1963
"From any point of view other than that properly specified, that is of course nonsense. What the comparison really indicates is that hemoglobin is a bad choice and has nothing to tell us about attributes, or indeed tells us a lie."
Gaylord Simpson, Science, 1964

A phylogenetic tree
[Figure: phylogenetic tree with leaves Bonobo, Chimpanzee, Human, Neanderthal, Gorilla and Orangutan]

Molecular evolution
100 million years ago: AGTAGCAGTACGATACG
Today: AGTTGCAGTAGGATATG

The molecular clock
If an equal amount of change occurs on every evolutionary path from the root to a leaf, then the phylogenetic tree obeys the molecular clock hypothesis. However, DNA rarely evolves at a constant rate across different lineages.

Genetic distances
Genetic distances can be inferred, e.g., by comparing DNA sequences. This yields information about pairwise distances.
[Table: pairwise distances for Human-Neanderthal, Human-Chimpanzee, Human-Bonobo, Neanderthal-Chimpanzee, Neanderthal-Bonobo and Chimpanzee-Bonobo]
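As a rough illustration of inferring a distance by comparing DNA sequences, the sketch below computes the proportion of differing sites (often called the p-distance) between aligned sequences. The function names are hypothetical, and the p-distance is only one simple choice of distance; the slides do not say which measure was used. The two example sequences are the ones from the "Molecular evolution" slide.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def distance_matrix(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    return {(s, t): p_distance(seqs[s], seqs[t]) for s, t in combinations(seqs, 2)}


# Sequences taken from the "Molecular evolution" slide.
ancestral = "AGTAGCAGTACGATACG"
present = "AGTTGCAGTAGGATATG"

pairwise = distance_matrix({"ancestral": ancestral, "present": present})
print(pairwise)  # the two sequences differ at 3 of 17 sites
```

With more than two species, the same call produces one entry per pair, which is the kind of input the distance-based methods expect.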

Reconstruct the phylogenetic tree
Construct a tree (topology and edge lengths) that reflects the inferred pairwise distances as well as possible.
[Figure: tree with leaves Bonobo, Chimpanzee, Human and Neanderthal]

Species trees and gene trees
The species tree can differ from a tree built from homologous genes.

A speculatively rooted tree for rRNA genes

Terminology
Labeling / annotation:
  leaves (and nodes) are labelled with species/objects (1, 2, ..., n)
  edge lengths reflect the amount of evolution/time along the edge
Tree structure / topology:
  rooted vs. unrooted
  binary (bifurcating) vs. non-binary (multifurcating)

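Distances in trees
The distance between two leaves i and j is the sum of the edge lengths on the path between them, e.g.
$d_{1,4} = 12 + 13 + 14 + 17 + 13 = 69$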
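The path-sum definition of a tree distance can be sketched as a traversal over a weighted tree. The tree below is hypothetical (the slide's figure is not available); it is only constructed so that the path from leaf 1 to leaf 4 uses edges of length 12, 13, 14, 17 and 13. The internal node names a, b, c, d and the remaining edge lengths are made up for illustration.

```python
def tree_distance(adj: dict, start: str, goal: str) -> float:
    """Sum of edge lengths on the unique path between two nodes of a tree."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == goal:
            return dist
        for nxt, weight in adj[node].items():
            if nxt != parent:  # never walk back along the edge we came from
                stack.append((nxt, node, dist + weight))
    raise ValueError("nodes are not connected")


# Hypothetical unrooted binary tree with leaves 1..6 and internal nodes a..d.
edges = [
    ("1", "a", 12), ("a", "b", 13), ("b", "c", 14), ("c", "d", 17), ("d", "4", 13),
    ("a", "2", 5), ("b", "5", 4), ("c", "6", 9), ("d", "3", 7),
]
adj: dict = {}
for u, v, w in edges:
    adj.setdefault(u, {})[v] = w
    adj.setdefault(v, {})[u] = w

print(tree_distance(adj, "1", "4"))  # 12 + 13 + 14 + 17 + 13
```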

Layout doesn't matter: these two rooted binary trees are considered equal.
[Figure: two layouts of the same rooted binary tree]

Newick format
http://en.wikipedia.org/wiki/newick_format
http://biopython.org/wiki/phylo
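To give a feel for the Newick format, here is a minimal sketch that writes a tree stored as nested tuples in Newick notation (topology only, no edge lengths). The helper name `to_newick` and the nested-tuple representation are assumptions made for this example, and the topology is assumed for illustration because the slide's figure is not available. Libraries such as Biopython's Phylo module read and write the full format.

```python
def to_newick(node) -> str:
    """Serialize a nested-tuple tree to Newick notation (without the final ';')."""
    if isinstance(node, str):  # a leaf is just its label
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# Assumed topology for the six species named on the slides.
tree = (((("Bonobo", "Chimpanzee"), ("Human", "Neanderthal")), "Gorilla"), "Orangutan")

newick = to_newick(tree) + ";"
print(newick)
# ((((Bonobo,Chimpanzee),(Human,Neanderthal)),Gorilla),Orangutan);
```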

Reconstructing trees
Reconstruct the unknown true tree from information about the species.
Character-based information: each species is characterized by a number of characters which can be in different states, e.g. a multiple alignment:
species 1: acgcta--cacacagt
species 2: acgcca---acg-agt
species 3: acg--attt-agc--t
Distance-based information: the distance D(i,j) is given for each pair of species.

Distance-based reconstruction
The ideal case: the tree reflects the distance matrix, i.e. the distances induced by the tree are equal to the distances in the matrix. If this is possible, then the distance matrix is called additive.
Reconstruction problem: find both the tree topology and the edge lengths.
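One standard way to test whether a matrix is additive, not stated on the slide, is the four-point condition: for every four species, the two largest of the three sums D(i,j)+D(k,l), D(i,k)+D(j,l) and D(i,l)+D(j,k) must be equal. The sketch below assumes D is already a metric (symmetric, zero diagonal, triangle inequality) with at least four species; the function name, tolerance parameter and example matrices are hypothetical.

```python
from itertools import combinations


def is_additive(D, tol: float = 1e-9) -> bool:
    """Four-point condition check; assumes D is a metric given as a square list of lists."""
    n = len(D)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([D[i][j] + D[k][l], D[i][k] + D[j][l], D[i][l] + D[j][k]])
        if abs(sums[2] - sums[1]) > tol:  # the two largest sums must coincide
            return False
    return True


# Distances induced by a hypothetical tree ((A,B),(C,D)) with leaf edges
# 1, 2, 3, 4 and an internal edge of length 5.
tree_like = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
# Same matrix with D(A,D) changed from 10 to 12, which no tree can induce.
perturbed = [row[:] for row in tree_like]
perturbed[0][3] = perturbed[3][0] = 12

print(is_additive(tree_like), is_additive(perturbed))
```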

Distance-based reconstruction
Goal: reconstruct an evolutionary tree from a distance matrix.
Input: $n \times n$ distance matrix $D_{ij}$
Output: weighted tree T with n leaves fitting D
If D is additive, this problem has an efficient solution (which we will come back to); otherwise we should aim at finding a tree that fits D as well as possible.

A reconstruction method: least-squares methods
Given a tree topology T, select edge lengths such that
$$Q(T) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (D_{ij} - d_{ij})^2$$
is minimized, where $D_{ij}$ is given by the matrix, $d_{ij}$ is the distance induced by the tree and the selected edge lengths, and $w_{ij}$ is the confidence we have in $D_{ij}$, e.g. $w_{ij} = 1$.
Optimal edge lengths can be found by standard least-squares methods in time $O(n^3)$. The larger version of the problem, where the tree is unknown, i.e. we must also minimize over T, is NP-complete.
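The objective Q(T) can be evaluated directly once the tree-induced distances are known. The sketch below only computes the value of Q for given matrices and does not perform the minimization over edge lengths; the function name and the example matrices are hypothetical, and all weights default to 1.

```python
def least_squares_error(D, d, w=None) -> float:
    """Q(T) = sum_i sum_j w_ij * (D_ij - d_ij)^2, with w_ij = 1 when w is omitted."""
    n = len(D)
    total = 0.0
    for i in range(n):
        for j in range(n):
            weight = 1.0 if w is None else w[i][j]
            total += weight * (D[i][j] - d[i][j]) ** 2
    return total


# d: distances induced by a hypothetical tree; D: observed matrix that
# disagrees with the tree only in the entry for species 0 and 3.
d = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
D = [row[:] for row in d]
D[0][3] = D[3][0] = 12

print(least_squares_error(D, d))  # both (0,3) and (3,0) contribute (12 - 10)^2
```

Because the double sum runs over all ordered pairs, each unordered pair is counted twice, exactly as in the formula.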

Two practical algorithms
UPGMA constructs good trees (rooted binary trees) if a molecular clock is thought to be a reasonable assumption. If the distance matrix comes from a clocklike tree, then UPGMA finds the tree that is the exact representation of the matrix.
Neighbor-Joining constructs trees (unrooted binary trees) that can be seen as a rough approximation to least-squares methods. If the distance matrix is additive, then Neighbor-Joining finds the tree that is the exact representation of the matrix (if this tree is binary).
Check Dendroscope for implementations of various tree reconstruction methods and visualizations: http://ab.inf.uni-tuebingen.de/software/dendroscope/
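The slides do not give the details of UPGMA, so the following is only a minimal sketch of the usual textbook formulation: repeatedly merge the two closest clusters, place the new node at half their distance, and update distances as size-weighted averages. The function name, the nested-tuple output `(left, right, height)` and the example matrix are assumptions made for illustration.

```python
def upgma(names, D):
    """Return a rooted tree as nested tuples (left, right, height); leaves are names."""
    n = len(names)
    # cluster id -> (subtree, number of leaves in the cluster)
    clusters = {i: (names[i], 1) for i in range(n)}
    # distances between current clusters, keyed by (smaller id, larger id)
    dist = {(i, j): float(D[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])  # closest pair
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # distance from the merged cluster to each remaining cluster:
        # average of the old distances, weighted by cluster size
        merged = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            merged[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        for c, value in merged.items():
            dist[(c, next_id)] = value  # new ids are always the largest
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), size_a + size_b)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


# Hypothetical clocklike distances: A,B join at height 1, C,D at height 2, root at 4.
names = ["A", "B", "C", "D"]
D = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
print(upgma(names, D))
```

On this clocklike input the sketch recovers the generating tree, in line with the claim on the slide; on matrices that are not clocklike the result is only a heuristic fit.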

Counting rooted binary trees
A rooted binary tree with n leaves has 2n-2 edges. A new species can be added to any edge of a tree with n-1 leaves or above its root, i.e.
$$R(n) = R(n-1) \cdot (2(n-1) - 2 + 1) = R(n-1) \cdot (2n - 3)$$
What about unrooted trees? What about non-binary trees?
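The recurrence can be unrolled numerically to see how fast the number of rooted binary trees grows. The sketch below assumes the base case R(2) = 1 (two leaves can only be joined in one way); the function name is hypothetical.

```python
def count_rooted_binary_trees(n: int) -> int:
    """Number of rooted binary trees on n labelled leaves via R(n) = R(n-1) * (2n - 3)."""
    if n < 2:
        raise ValueError("need at least two leaves")
    count = 1  # base case R(2) = 1
    for k in range(3, n + 1):
        count *= 2 * k - 3
    return count


for n in (2, 3, 4, 5, 10):
    print(n, count_rooted_binary_trees(n))
# 2 1
# 3 3
# 4 15
# 5 105
# 10 34459425
```

Already for 10 species there are more than 34 million candidate trees, which is why exhaustive search over topologies quickly becomes infeasible.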