Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Similar documents
Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Dr. Amira A. AL-Hosary

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Constructing Evolutionary/Phylogenetic Trees

Phylogenetic Tree Reconstruction

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

Constructing Evolutionary/Phylogenetic Trees

Evolutionary Tree Analysis. Overview

C.DARWIN ( )

BINF6201/8201. Molecular phylogenetic methods


Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

What is Phylogenetics

EVOLUTIONARY DISTANCES

Chapter 26 Phylogeny and the Tree of Life

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Phylogenetics. BIOL 7711 Computational Bioscience

Theory of Evolution Charles Darwin

Algorithms in Bioinformatics

Phylogeny Tree Algorithms

How to read and make phylogenetic trees Zuzana Starostová

Phylogenetic analyses. Kirsi Kostamo

Phylogenetics: Building Phylogenetic Trees

Phylogeny: building the tree of life

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

The practice of naming and classifying organisms is called taxonomy.

Phylogenetic inference

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

Introduction to characters and parsimony analysis

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

8/23/2014. Phylogeny and the Tree of Life

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

A (short) introduction to phylogenetics

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Multiple Sequence Alignment. Sequences

Phylogenetic inference: from sequences to trees

A Phylogenetic Network Construction due to Constrained Recombination

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular Evolution, course # Final Exam, May 3, 2006

DNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics

Effects of Gap Open and Gap Extension Penalties

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

Lecture 6 Phylogenetic Inference

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

How should we organize the diversity of animal life?

C3020 Molecular Evolution. Exercises #3: Phylogenetics

Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki

Lecture 11 Friday, October 21, 2011

Consensus methods. Strict consensus methods

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

Theory of Evolution. Charles Darwin

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Phylogenetic trees 07/10/13

Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley. Parsimony & Likelihood [draft]

Quantifying sequence similarity

Phylogenetic Analysis

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Phylogenetic Analysis

Phylogenetic Analysis

CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003

Anatomy of a tree. clade is group of organisms with a shared ancestor. a monophyletic group shares a single common ancestor = tapirs-rhinos-horses

Molecular Evolution and Phylogenetic Tree Reconstruction

Biology 211 (2) Week 1 KEY!

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Phylogenies & Classifying species (AKA Cladistics & Taxonomy) What are phylogenies & cladograms? How do we read them? How do we estimate them?

Phylogenetic Tree Generation using Different Scoring Methods

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies

Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Organizing Life s Diversity

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

Concepts and Methods in Molecular Divergence Time Estimation

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

Plan: Evolutionary trees, characters. Perfect phylogeny Methods: NJ, parsimony, max likelihood, Quartet method

Phylogenetic Networks, Trees, and Clusters

Phylogeny: traditional and Bayesian approaches

Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees

Macroevolution Part I: Phylogenies

Classification and Phylogeny

Chapter 19: Taxonomy, Systematics, and Phylogeny

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Chapter 16: Reconstructing and Using Phylogenies

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

CHAPTER 26 PHYLOGENY AND THE TREE OF LIFE Connecting Classification to Phylogeny

Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences

Lecture Notes: Markov chains

Phylogeny and the Tree of Life

Transcription:

Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz

Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels with Genetic Algorithms (time permitting)

Definition: Phylogenetic Tree This is a REAL tree, not to be confused with a graphical representation A tree (graphical representation) that shows evolutionary relationships based upon common ancestry. Describes the relationship between a set of objects (species or taxa). [Israngkul]

Phylogenetic Tree Finding a tree like structure that defines certain ancestral relationships between a related set of objects. [Reijmers et al, 1999] Composed of branches/edges and nodes Can be gene families, single gene from many taxa, or combination of both. [Baldauf, 2003]

Terminology Branches connections between nodes Evolutionary tree patterns of historical relationships between the data Leaves terminal node; taxa at the end of the tree Nodes represent the sequence for the given data; Internal nodes correspond to the hypothetical last common ancestor of everything arising from it. [Baldauf, 2003] Taxa (a car for hire) individual groups Tree (number after two) - mathematical structure consisting of nodes which are connected by branches

Rooted vs Unrooted Rooted tree Directed tree Has a path Accepted common ancestor Doesn't blow over in the wind Unrooted Tree Typical results Unknown common ancestor Common in Arizona, blowing around plains

Rooted Phylogenetic Tree

Unrooted Phylogenetic Tree

Combinatorial Explosion Number of topologies = Product i =3 to x (2i 5) Ten objects yields 2,027,025 possible trees 25 objects yields about 2.5 x 10 28 trees Branches 2x -3 branches X are peripheral (X - 3) are interior

Why Do We Do It? Understand evolutionary history Show visual representation of relationships and origins Healthcare Origins of diseases Show how they are changing Help produce cures and vaccines Model allows calculations to determine distances Forensics Our history

Our Family

Terminal Node

Evolution Theory that groups of organisms change over time so that descendants differ structurally and/or functionally from their ancestors. [Pevsner, 2003] Biological process by which organisms inherit morphological and physiological features than define a species. [Pevsner, 2003] Biological theory that postulating that the various types of animals and plants have their origin in other preexisting types and that distinguishable differences are due to modifications in successive generations. [Encyclopaedia Britannica, 2004]

Example

Ways to do it Parsimony Maximum Likelihood Estimator Distance based methods Clustering Genetic algorithm * While there are numerous methods, these are among the most popular

Parsimony

Character based Parsimony Search for tree with fewest number character changes that account for observed differences Best one has the least amount of evolutionary events required to obtain the specific tree Advantages: Simple, intuitive, logical, and applicable to most models Can be used on a wide variety of data More powerful approach than distance to describe hierarchical relationship of genes & proteins (Pevsner, 2003) Disadvantages; No mathematical origins Fooled by same multiple or circuitous changes

Parsimony Methods The best tree is the one that minimizes the total number of mutations at all sites [Israngkul] The assumption of physical systematics is that genes exist in a nested hierarchy of relatedness and this is reflected in a hierarchical distribution of shared characters in the sequence. (Pevsner, 2003)

Maximum Likely Hood

Maximum Likelihood Estimator Tree with highest probability of evolving from given data Mathematical Process Complex Math many have problems with this Advantages: Can be used for various types of data including nucleotides and amino acids Usually the most consistent Disadvantages: Computationally intense Can be fooled by multiple or circuitous changes

Distance Based Methods Uses distances between leaves Upper triangular matrix of distances between taxa Percent similarity Metric Number of changes Distance score Produces edge weighted tree Least squares error Can use Matlab

Hamming distance Distances n = # sites different N = alignment length D = 100% x (n/n) ignore information of evolutionary relationship Jukes-Cantor D = -3/4 ln (1-4P/3) Kimura Transitions more likely than transversions Transitions given more weight

Distances The walk from the parking lot Now that's far!

Least Squares Start with distance matrix Pick tree type to start Calculate distances to minimize SSE Try other trees Time consuming Exhaustive search will yield optimal tree, but also may take l-o-n-g time

Example of Distance Matrix

Clustering Genetic algorithm Neighbor joining UPGMA PAUP contains the last two

Neighbor joining Clustering Uses distances between pairs of taxa i.e. Number of nucleotide differences Not individual characters Builds shortest tree by complex methods UPGMA Unweighted Pair Group Method w/ Arithmetic Mean Starts with first two most similar nodes Compares this average/composite to the next Never uses original nodes again

Summary Real life is usually not the optimum tree The best model is one that is obtained by several methods

Comments Sometimes difficult because Do not have complete fossil record Parallel evolution Character reversals Circuitous changes Bifurcating vs polytomy split No animals, plants, or robots were hurt in the making of this presentation

References Baldauf, S.: 2003, Phylogeny for the faint of heart: a tutorial, Trends in Genetics, 19(6) pp. 345-351 Encyclopaedia Brittanica; 2004, The Internet http://www.brittannica.com Futuyma, D; 1998, Evolutionary Biology, Third Edition, Sinauer Associates, Sunderland MA Israngkul, W; 2002, Algorithms for Phylogenetic Tree Reconstruction, the Internet: biohpc.learn.in.th/files/contents2003/worawit/ Algorithms Opperdoes, F., 1997 Construction of a Distance Tree Using Clustering with the UPGMA, the Internet : http://www.icp.ucl.ac.be/~opperd/private/upgma.html Pevsner, J.; 2003, Bioinformatics & Functional Genomics, Wiley, Hoboken NJ Reijmers, T., Wehrens, R., Daeyaert, F., Lewi, P., & Buydens, L.; 1999, Using genetic algorithms for the construction of phylogenetic trees: application to G- protein coupled receptor sequences, BioSystmes (49) pp. 31-43 Renaut, R., 2004 Computational BioSciences Class, CBS-520, Arizona State University

Genetic Algorithms Optimizing distance clustering (Reijmers et al) Optimal Distance method Not guaranteed the most optimal, only near optimal Exhaustive exploration not guaranteed Same solution may be checked multiple times Simulated time evolution (Brad's project for winter break)