Phylogenetics. IMBB 2016, BecA-ILRI Hub, Nairobi, May 9-20, 2016, Joyce Nzioki

Phylogenetics
IMBB 2016, BecA-ILRI Hub, Nairobi, May 9-20, 2016
Joyce Nzioki

Phylogenetics
The study of the evolutionary relatedness of organisms. Derived from two Greek words:
» Phyle/Phylon: tribe/race
» Genetikos: relating to birth

Phylogenetics
» Evolution is the change in the distribution of allele frequencies from one generation to the next.
» Similarity in sequence data is taken as an indication of evolutionary relatedness.
» Sequence difference is taken as a measure of evolutionary divergence.
» Progression rule: the more distant an organism is from its ancestor, the more derived its characters are.

Phylogenetics
» A phylogeny can be inferred from molecular data (DNA/RNA/proteins) or morphological data.
» Taxonomy is informed by phylogenetics.
» Phylogenetic trees are graphical representations of evolutionary relationships through time or genetic distance.

Aspects of phylogeny
We will look at the major aspects of phylogenies and how they can be interpreted.
1. Topology
2. Branches
3. Nodes
4. Confidence

Phylogenetic tree
A branching diagram that depicts the inferred evolutionary relationships of various species based on their physical or genetic traits. Its parts are:
» Leaves / operational taxonomic units (OTUs)
» Branches
» Internal nodes / hypothetical taxonomic units (HTUs)
» Root

Topology
The topology is the branching structure of a tree. The three top trees have the same topology: A and B are more closely related to one another than to C. The tree in the bottom box shows different relationships.

Alternative representations of phylogenies
The same topology can be drawn in different ways; the common formats are shown above.
» Diagonal and rectangular formats are commonly used for publication.
» The curved format is used in review papers to present summaries of phylogenies.
» The radial format is used to show unrooted trees.
» The circular format is used to represent large phylogenies.

Branches
» Branches show the path of transmission of genetic information from one generation to the next.
» Branch lengths indicate genetic change: the longer the branch, the more genetic change has occurred.
» Informative branch lengths are drawn to scale and indicate the number of substitutions per site.
» A naive method of estimating genetic change is to count the number of differences and divide by the sequence length (e.g. 1/10 = 0.1), though this does not account for multiple substitutions at the same site.
» To overcome this issue we use evolutionary models to infer the genetic changes that have occurred.
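The naive estimate above, and one possible model-based correction, can be sketched in a few lines of Python. This is only an illustration: the two 10-site sequences are made up to reproduce the 1/10 example, and the Jukes-Cantor formula is used here as the simplest evolutionary model, not necessarily the one used in the course practicals.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)

def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance (substitutions per site) from p."""
    if p >= 0.75:
        raise ValueError("the correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical aligned sequences that differ at one site out of ten.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTACGTAT"

p = p_distance(seq_1, seq_2)
d = jukes_cantor(p)
print(f"naive p-distance: {p:.3f}, Jukes-Cantor distance: {d:.4f}")
```

The corrected distance is slightly larger than the naive one because the model allows for sites that changed more than once.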

Nodes
» Tips / OTUs / external nodes: the sequences sampled for the phylogeny.
» Internal nodes occur at the point where more than one branch meets and represent the ancestral sequence / hypothetical common ancestor.
» The root represents the most recent common ancestor of all the sequences in the tree.

Rooted vs unrooted tree
» Unrooted tree: does not show the direction of evolution / ancestry.
» Rooted tree: the direction of evolution is indicated as moving away from the root.

Rooting a tree
Two methods are commonly used for tree rooting:
1. Outgroup criterion: include in the analysis a group of sequences known a priori to be external to the group under study; the root is by necessity on the branch joining the outgroup and the other sequences.
2. Midpoint rooting: this method assumes that all lineages have evolved at the same rate since diverging from their common ancestor. The root is placed at the midpoint of the longest path between any two leaves, which under this assumption is equidistant from all the leaves.

Rooting a tree with an outgroup
This is the use of an organism or group of organisms (the outgroup) that is more evolutionarily distant from the group under study (the ingroup) than the ingroup members are from each other. The common ancestor is therefore placed between the ingroup and the outgroup. This effectively roots the tree, and evolutionary distances will be relative to this point (it gives a direction of evolution).
[Figure: tree showing ingroup and outgroup]

Selecting an outgroup
» An outgroup should not be too distantly related to the ingroup; this results in very long branches that distort the remaining branches and render the topology unreliable.
» The outgroup should also not be too closely related to the ingroup, as it may then not be a true outgroup.
» Using several outgroup species may better balance the final tree branching.

Confidence
» Inferring phylogenies is inherently an uncertain process.
» Approaches used to estimate our confidence in the inferred tree topology: bootstraps, likelihood and Bayesian approaches. In this course we will focus on the bootstrap.
» Support values are typically shown on the phylogeny, indicating the percentage confidence in a branch.
» Interpreting the exact meaning of confidence values is still an area of debate, but generally >50% is acceptable provided an appropriate evolutionary model was used.

Bootstrapping
» Bootstrapping is a commonly used test of the reliability of an inferred phylogenetic tree.
» A single tree may not be credible given the dependencies involved (characters, evolutionary model, parameters).
» Bootstrapping is done by generating 100-1000 replicates of your data: character positions (alignment columns) are resampled at random with replacement to create a series of bootstrap samples of the same size as the original data.
» The bootstrap datasets are analyzed, looking for consistency.
» Variation among the datasets is used to estimate the error involved in making estimates from the original data.
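The resampling step can be illustrated with a short Python sketch. It assumes the standard bootstrap (alignment columns drawn with replacement), uses a made-up four-sequence alignment, and stops before the tree-building step, which would be repeated on every replicate.

```python
import random

def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices; some columns repeat, others are left out.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

# Hypothetical alignment used only for illustration.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TGCAACGTAC",
    "D": "TGCATCGTAC",
}

rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

A tree would then be built from each replicate, and the support for a branch is the percentage of replicate trees that contain it.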

Interpreting patterns of relatedness
» Humans are more closely related to mice than to lizards (they share a more recent common ancestor).
» Frogs are more closely related to lizards than they are to fish (they share a common ancestor with lizards that is more recent than the one they share with fish).
» Fish are equally related to mice and to frogs: mice and frogs share the same common ancestor (black spot) with fish.
» Rotating the branches around a node does not change the topology, so the biological meaning remains the same.
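These statements can be checked mechanically by finding the most recent common ancestor (MRCA) of each pair. The sketch below assumes the tree (fish,(frog,(lizard,(mouse,human)))), which is consistent with the statements above; the internal node names are hypothetical labels chosen for readability.

```python
# Child -> parent map for the assumed tree (fish,(frog,(lizard,(mouse,human)))).
PARENT = {
    "human": "mammal_anc", "mouse": "mammal_anc",
    "mammal_anc": "amniote_anc", "lizard": "amniote_anc",
    "amniote_anc": "tetrapod_anc", "frog": "tetrapod_anc",
    "tetrapod_anc": "root", "fish": "root",
}

def path_to_root(node: str) -> list:
    """Return the node followed by all of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def mrca(a: str, b: str) -> str:
    """Most recent common ancestor: first ancestor of b that is also an ancestor of a."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")

def depth(node: str) -> int:
    """Number of branches between the node and the root (deeper means more recent)."""
    return len(path_to_root(node)) - 1

# Humans vs mice share a more recent ancestor than humans vs lizards.
print(mrca("human", "mouse"), mrca("human", "lizard"))
# Fish share the same ancestor (the root) with both mice and frogs.
print(mrca("fish", "mouse"), mrca("fish", "frog"))
```

Two taxa are more closely related than another pair when their MRCA lies deeper in the tree, that is, further from the root.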

Major stages in phylogenetic analysis

Building a phylogenetic tree
» Starting point: a set of homologous, aligned DNA or protein sequences.
» The quality of the alignment is essential; unreliable parts of the alignment are omitted.
» Most methods only take substitutions into account; gaps (deletions/insertions) are not used.

The gene compared must evolve at a rate appropriate to the divergence time of the organisms; for example:
» 18S rRNA gene for phylum-level divergences, since it evolves slowly
» Hemoglobin genes for mammalian orders
» Mitochondrial DNA for species divergences within a genus
» Repetitive DNA sequences (e.g. microsatellites) for individuals within a species

Methods of building a tree
1. Distance methods
2. Character-based methods
   1. Maximum parsimony
   2. Maximum likelihood
   3. Bayesian inference

Distance methods
» Start from a multiple sequence alignment.
» Make a matrix of pairwise distances.
» Build a phylogenetic tree, e.g. neighbor-joining and UPGMA trees.
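The three steps can be sketched as follows. This is a minimal illustration, not a production implementation: the alignment is made up, uncorrected p-distances are used for the matrix, and the UPGMA function returns only the topology (as nested tuples) without branch lengths.

```python
from itertools import combinations

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites that differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def distance_matrix(alignment: dict) -> dict:
    """Pairwise distances keyed by the unordered pair of sequence names."""
    return {
        frozenset((x, y)): p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    }

def upgma(dist: dict, names: list):
    """Repeatedly merge the two closest clusters; return the nested-tuple topology."""
    sizes = {name: 1 for name in names}   # cluster -> number of leaves it contains
    d = dict(dist)
    while len(sizes) > 1:
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        for other in sizes:
            if other not in (a, b):
                # Size-weighted average of the distances from a and b to the other cluster.
                d[frozenset((merged, other))] = (
                    sizes[a] * d[frozenset((a, other))]
                    + sizes[b] * d[frozenset((b, other))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
    return next(iter(sizes))

# Hypothetical alignment used only for illustration.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TGCAACGTAC",
    "D": "TGCATCGTAC",
}

tree = upgma(distance_matrix(alignment), list(alignment))
print(tree)
```

In practice the p-distances would be replaced by model-corrected distances before the tree is built.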

Distance methods
To estimate evolutionary distances between sequences we need statistical / evolutionary models. These models estimate evolutionary distance while accounting for residue substitution and homoplasy.
» Jukes-Cantor: good for distances <10%
» Kimura 2-parameter: distances 10-30% and transitions ≠ transversions
» Tamura: distances 10-30% and strong G+C bias
» Jin-Nei γ: distances 10-30% and varying transition-transversion rates
» Tajima-Nei: distances 30-100%
These evolutionary distances are then assembled into a distance matrix used in building the tree.
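As an illustration of one of these models, the Kimura 2-parameter distance can be computed from the proportions of transitions (P) and transversions (Q) as d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The sketch below assumes DNA sequences, skips sites with gaps or ambiguous bases, and uses two made-up 20-site sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    n = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1    # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    P, Q = transitions / n, transversions / n
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))

# Hypothetical sequences: two transitions (A>G, C>T) and one transversion (G>T).
seq_1 = "ACGTACGTACGTACGTACGT"
seq_2 = "GTTTACGTACGTACGTACGT"

d = k2p_distance(seq_1, seq_2)
print(f"K2P distance: {d:.4f}")
```

The uncorrected proportion of differences here is 3/20 = 0.15, and the model-corrected distance is somewhat larger.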

Character-based methods
These methods analyse a set of discrete characters; each position in an aligned sequence is a character. All characters can be analyzed separately and independently of one another. These methods include:
1. Maximum Parsimony (MP)
2. Maximum Likelihood (ML)
3. Bayesian methods

Maximum Parsimony
Parsimony involves evaluating all possible trees and giving each a score based on the number of evolutionary changes needed to explain the observed data. The best tree is the one that requires the fewest base changes for all sequences to derive from a common ancestor.
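The scoring step can be sketched with Fitch's algorithm, one standard way of counting the minimum number of changes on a given tree; the slides do not say which algorithm is used, so this is an assumption. Trees are written as nested tuples, the alignment is made up, and the three possible groupings of four taxa are scored.

```python
def fitch(tree, states: dict):
    """Return (possible ancestral states, minimum changes) for one alignment column."""
    if isinstance(tree, str):                 # a leaf: its observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:                                # children agree: no extra change needed
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment: dict) -> int:
    """Sum the minimum number of changes over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Hypothetical alignment used only for illustration.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TGCAACGTAC",
    "D": "TGCATCGTAC",
}

candidate_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
scores = {tree: parsimony_score(tree, alignment) for tree in candidate_trees}
best_tree = min(scores, key=scores.get)
print(scores, best_tree)
```

With more taxa the number of possible trees grows very quickly, so real programs usually search the tree space heuristically rather than scoring every tree.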

Maximum Likelihood
Maximum Likelihood evaluates the topologies of different trees under a particular evolutionary model and picks the best one according to the likelihood score (the tree with the highest likelihood). It considers all characters and looks for the trees that best fit the given evolutionary model. It is possibly more accurate than maximum parsimony if an appropriate model is chosen.
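The idea of scoring candidates by likelihood can be shown on the smallest possible problem: two sequences joined by a single branch. The sketch below assumes the Jukes-Cantor model and simply tries a grid of branch lengths, keeping the one with the highest log-likelihood. A real ML program optimizes the topology and all branch lengths together; the sequences are made up.

```python
import math

def jc_log_likelihood(seq_a: str, seq_b: str, d: float) -> float:
    """Log-likelihood of two aligned sequences separated by distance d under Jukes-Cantor."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site shows the same base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the base observed in seq_a.
        log_l += math.log(0.25 * (p_same if a == b else p_diff))
    return log_l

# Hypothetical aligned sequences that differ at one site out of ten.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTACGTAT"

# Candidate branch lengths from 0.001 to 1.000 substitutions per site.
candidates = [i / 1000 for i in range(1, 1001)]
best_d = max(candidates, key=lambda d: jc_log_likelihood(seq_1, seq_2, d))
print(f"maximum-likelihood distance: {best_d:.3f}")
```

For this two-sequence case the grid search lands on the same value as the Jukes-Cantor distance formula, which is the closed-form maximum-likelihood estimate under that model.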

Summary
» UPGMA assumes a molecular clock and so provides a rooted tree (this assumption may be too strong in some cases).
» Neighbor joining does not assume a molecular clock and can recover the correct tree when evolutionary rates vary among lineages.
» Maximum parsimony is good for closely related sequences.
» Maximum likelihood is the most general of these methods.

Applications of Phylogenetics
1. Classification: phylogenetics gives accurate patterns of relatedness that now inform the Linnaean classification of new species.
2. Forensics: assess DNA evidence presented in court.
3. Origin of pathogens: learn about new pathogen outbreaks and the source of transmission.
4. Conservation: can inform conservation policies for species at risk of extinction.

Thanks
Thanks to Anne Jores and EBI online training for some of the slides presented.