Phylogenetic trees 07/10/13

A tree is the only figure to occur in On the Origin of Species by Charles Darwin. It is a graphical representation of the evolutionary relationships among entities that share a common ancestor.

The phylogenetic tree of a group of genes does not necessarily reflect the phylogenetic tree of their host species, because of gene duplication and lateral gene transfer. Homologous genes are classified by the event that separated them:
Orthologs: genes that diverged due to speciation
Paralogs: genes that diverged due to gene duplication
Xenologs: genes acquired through lateral gene transfer

Some Background
All trees we consider will be binary.
The length of a branch, or edge, indicates the amount of evolutionary divergence.
True biological trees have a root.

Distance Matrix Methods
Not as bad as they may seem from an initial look: remarkably little information is lost.
Introduced by Cavalli-Sforza and Edwards (1967) and Fitch and Margoliash (1967).
Influenced by clustering algorithms.
General idea:
1. Calculate a measure of distance between each pair of taxa.
2. Find a tree that predicts the observed set of distances as closely as possible.
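The slides do not say which distance measure is used. As a minimal sketch, assuming aligned sequences of equal length and the simple p-distance (fraction of differing sites), the first step of the general idea could look like this. The function names and the toy alignment are illustrative, not from the original.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs):
    """Symmetric pairwise distances, keyed by (taxon, taxon)."""
    d = {(t, t): 0.0 for t in seqs}
    for s, t in combinations(seqs, 2):
        d[s, t] = d[t, s] = p_distance(seqs[s], seqs[t])
    return d

# Hypothetical toy alignment (assumption, for illustration only)
seqs = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
D = distance_matrix(seqs)
```

In practice a corrected distance (for example one derived from a substitution model) is usually preferred, since the raw p-distance underestimates divergence when sites have changed more than once.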

Distance methods
Input: distance matrix between species.
Outline: cluster species together.
Initially clusters are singletons.
At each iteration combine the two closest clusters to get a new one.

Unweighted Pair Group Method using Arithmetic Averages (UPGMA)
Despite its formidable acronym, the method is simple and intuitively appealing. It works by clustering the sequences, at each stage combining two clusters and, at the same time, creating a new node on the tree. Thus, the tree can be imagined as being assembled upwards, each node being added above the others, and the edge lengths being determined by the difference in the heights of the nodes at the top and bottom of an edge.

[Figures: an example showing how UPGMA produces a rooted phylogenetic tree]

UPGMA
1. Initialize n clusters, where each cluster i contains the sequence i.
2. Find the closest pair of clusters i, j, using the distances in matrix D.
3. Make them neighbors in the tree by adding a new node (ij) at height $D_{i,j}/2$. The edge from (ij) to each child is the difference of their heights, which is $D_{i,j}/2$ when the child is a single sequence.
4. Update distance matrix D: for all clusters k do the following ($n_i$ and $n_j$ are the sizes of clusters i and j respectively):
   $D_{(ij),k} = \frac{n_i D_{i,k} + n_j D_{j,k}}{n_i + n_j}$
5. Delete the columns and rows for i and j in D and add new ones corresponding to cluster (ij), with distances as computed above.
6. Go to step 2 until only one cluster is left.
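The UPGMA steps can be sketched in Python as follows. This is an illustrative implementation, not the author's code: the function name `upgma`, the nested-tuple tree representation and the toy matrix are assumptions. The update uses the size-weighted average of the two merged clusters' distances, which is the standard UPGMA rule.

```python
def upgma(labels, matrix):
    """Build a rooted tree from a symmetric distance matrix.

    A leaf is its label; an internal node is a pair
    ((left_child, left_edge_length), (right_child, right_edge_length)).
    Returns (tree, root_height).
    """
    # cluster id -> (subtree, cluster size, node height)
    clusters = {i: (labels[i], 1, 0.0) for i in range(len(labels))}
    # distances keyed by (smaller id, larger id)
    d = {(i, j): matrix[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(labels)

    while len(clusters) > 1:
        # Step 2: closest pair of clusters
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        (ti, ni, hi), (tj, nj, hj) = clusters.pop(i), clusters.pop(j)

        # Step 3: new node at height dij / 2; edge length = height difference
        h = dij / 2
        tree = ((ti, h - hi), (tj, h - hj))

        # Steps 4-5: size-weighted average distance to every other cluster
        for k in clusters:
            dik = d.pop((min(i, k), max(i, k)))
            djk = d.pop((min(j, k), max(j, k)))
            d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        del d[(i, j)]

        clusters[next_id] = (tree, ni + nj, h)
        next_id += 1

    (tree, _, height), = clusters.values()
    return tree, height

# Hypothetical clock-like distances (assumption, for illustration only)
labels = ["A", "B", "C", "D"]
matrix = [[0, 2, 4, 6],
          [2, 0, 4, 6],
          [4, 4, 0, 6],
          [6, 6, 6, 0]]
tree, height = upgma(labels, matrix)
```

On this toy input A and B are joined first, then C, then D, and every leaf ends up at the same distance from the root.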

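Limitations of UPGMA
The resulting tree is always ultrametric: every leaf is at the same distance from the root.
This is only correct if the evolution rate is constant in all branches (the molecular clock assumption).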
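One way to see whether a given distance matrix fits this assumption is the three-point condition: distances are ultrametric exactly when, for every triple of taxa, the two largest of the three pairwise distances are equal. The sketch below checks that condition; the function name `is_ultrametric` and the tolerance parameter are assumptions made for illustration.

```python
from itertools import combinations

def is_ultrametric(matrix, tol=1e-9):
    """Three-point condition: in every triple the two largest distances agree."""
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        _, middle, largest = sorted((matrix[i][j], matrix[i][k], matrix[j][k]))
        if abs(largest - middle) > tol:
            return False
    return True

# Hypothetical matrices (assumption, for illustration only)
clock_like = [[0, 2, 4],
              [2, 0, 4],
              [4, 4, 0]]
not_clock_like = [[0, 3, 5],
                  [3, 0, 6],
                  [5, 6, 0]]
```

When the check fails, UPGMA can still be run, but the tree it returns may group taxa incorrectly, which is the motivation for neighbor joining.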

Neighbor joining
1. Initialization: same as UPGMA.
2. For each species i compute $u_i = \sum_{k \neq i} \frac{D_{i,k}}{n - 2}$, where n is the current number of clusters.
3. Select i and j for which $D_{i,j} - u_i - u_j$ is minimum.
4. Make them neighbors in the tree by adding a new node (ij), and set the distances from (ij) to i and j as
   $D_{i,(ij)} = \frac{1}{2}(D_{i,j} + u_i - u_j)$, $D_{j,(ij)} = \frac{1}{2}(D_{i,j} + u_j - u_i)$
5. Update distance matrix D: for all clusters k do the following:
   $D_{(ij),k} = \frac{1}{2}(D_{i,k} + D_{j,k} - D_{i,j})$
6. Delete the columns and rows for i and j in D and add new ones corresponding to cluster (ij), with distances as computed above.
7. Go to step 2 (the $u_i$ must be recomputed for the smaller matrix) until two nodes/clusters are left, then join them by an edge whose length is their remaining distance.
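The neighbor-joining steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `neighbor_joining`, the edge-list output, the string labels for internal nodes and the toy matrix are not from the original, and the $u_i$ are taken to be row sums divided by n - 2, which is the definition consistent with the branch-length formulas. At least two taxa with distinct string labels are assumed.

```python
from itertools import combinations

def neighbor_joining(labels, matrix):
    """Return the unrooted tree as a list of (node, node, edge_length)."""
    n = len(labels)
    # d[a][b] = current distance between clusters a and b
    d = {labels[i]: {labels[j]: float(matrix[i][j]) for j in range(n) if j != i}
         for i in range(n)}
    edges = []

    while len(d) > 2:
        m = len(d)
        # u_i = sum of distances from i to all other clusters, over (m - 2)
        u = {a: sum(d[a].values()) / (m - 2) for a in d}

        # Pick the pair minimising D_ij - u_i - u_j
        i, j = min(combinations(d, 2),
                   key=lambda p: d[p[0]][p[1]] - u[p[0]] - u[p[1]])
        dij = d[i][j]
        new = f"({i},{j})"  # label of the new internal node

        # Edge lengths from the new node to i and to j
        edges.append((i, new, 0.5 * (dij + u[i] - u[j])))
        edges.append((j, new, 0.5 * (dij + u[j] - u[i])))

        # Distances from the new node to every remaining cluster
        row = {k: 0.5 * (d[i][k] + d[j][k] - dij) for k in d if k not in (i, j)}

        # Replace rows and columns of i and j by those of the new node
        del d[i], d[j]
        for k in d:
            del d[k][i], d[k][j]
            d[k][new] = row[k]
        d[new] = row

    # Two nodes left: connect them directly
    a, b = d
    edges.append((a, b, d[a][b]))
    return edges

# Hypothetical additive distances (assumption, for illustration only):
# A and B hang off one internal node, C and D off another.
labels = ["A", "B", "C", "D"]
matrix = [[0, 3, 8, 9],
          [3, 0, 9, 10],
          [8, 9, 0, 9],
          [9, 10, 9, 0]]
edges = neighbor_joining(labels, matrix)
```

Unlike UPGMA, the two edges below a new node may have different lengths, so the result does not force a constant evolution rate. The tree is unrooted; a root has to be placed separately, for example with an outgroup.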