What is Phylogenetics

Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features) of the species, under the natural assumption that similar species (i.e., species with similar characters) are genetically close. The term phylogeny refers to these relationships, usually presented as a phylogenetic tree.

Classic phylogenetics dealt mainly with physical, or morphological, features: size, color, number of legs, etc. Modern phylogenetics uses information extracted from genetic material, mainly DNA and protein sequences. The characters used are usually the DNA or protein sites (a site is a single position in the sequence), obtained after aligning several such sequences and keeping only the blocks that are conserved in all the examined species. An interesting example is a research project that used phylogenetics to trace the origins of the human population on Earth.

During evolution, it is very common for a gene to be duplicated. Therefore, when discussing matching genes in different species, we differentiate between:
orthologous matches, in which both genes are "the same" gene in the strong sense: they are connected directly, through sequences that diverged after a speciation event, and not through a duplication;
paralogous matches, which are the result of some duplication along the evolutionary line;
xenologs (horizontal transfers), which are genes transferred between organisms in other ways (e.g., by a virus).
If we base our analysis on paralogs or xenologs (rather than orthologs), we are in big trouble: the resulting tree traces the history of the duplication or transfer rather than the history of the species.

Theory of Evolution
Basic idea: speciation events lead to the creation of different species.
Speciation is caused by physical separation into groups in which different genetic variants become dominant.
Any two species share a (possibly distant) common ancestor.

Basic Assumptions
More closely related organisms have more similar genomes.
Highly similar genes are homologous (have the same ancestor).
A universal ancestor exists for all life forms.
Molecular differences in homologous genes (or protein sequences) are positively correlated with evolutionary time.
Phylogenetic relationships can be expressed by a dendrogram (a tree).
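The assumption that molecular difference grows with evolutionary time is what makes sequence distances usable as input for tree building. Below is a minimal sketch, assuming the simple proportion of differing sites between two aligned sequences (often called the p-distance) as the measure. That choice, the function name `p_distance` and the example sequences are illustrative and not taken from the slides.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the sites that are present in both sequences.
    sites = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


# Hypothetical aligned fragments: one difference in eight sites.
print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125
```

The p-distance underestimates the true amount of change when sequences are very divergent, because repeated substitutions at the same site are counted at most once.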

Primate evolution: speciation events
A phylogeny is a tree that describes the sequence of speciation events that led to the formation of a set of present-day species; it is also called a phylogenetic tree.

Molecular clock
When all leaves of a phylogenetic tree are at the same level, the tree is said to satisfy a molecular clock. Namely, the time from a speciation event to the formation of the current species is identical for all paths (an assumption that often does not hold in reality).
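The molecular clock property can be checked directly on a rooted tree with branch lengths: every root-to-leaf path must have the same total length. A minimal sketch follows, assuming a hypothetical representation in which a leaf is a string and an internal node is a list of `(child, branch_length)` pairs. The example trees and their branch lengths are invented for illustration.

```python
from math import isclose


def leaf_depths(node, depth=0.0):
    """Return the root-to-leaf path length for every leaf under `node`."""
    if isinstance(node, str):
        return {node: depth}
    depths = {}
    for child, length in node:
        depths.update(leaf_depths(child, depth + length))
    return depths


def satisfies_clock(tree, tol=1e-9):
    """True if all leaves are equally distant from the root."""
    depths = list(leaf_depths(tree).values())
    return all(isclose(d, depths[0], abs_tol=tol) for d in depths)


# Invented branch lengths: every leaf is 3.0 units from the root.
clock_tree = [([("human", 1.0), ("chimp", 1.0)], 2.0), ("gorilla", 3.0)]
# Same shape, but the chimp branch is longer, so the clock is violated.
non_clock_tree = [([("human", 1.0), ("chimp", 2.5)], 2.0), ("gorilla", 3.0)]

print(satisfies_clock(clock_tree))      # True
print(satisfies_clock(non_clock_tree))  # False
```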

Phylogenetic tree
The results of a phylogenetic analysis are usually presented as a collection of nodes and branches, that is, a tree. In such a tree, taxa that are closely related in an evolutionary sense appear close to each other, and taxa that are distantly related are on different (far) branches of the tree. A phylogenetic tree or evolutionary tree is a branching diagram or "tree" showing the inferred evolutionary relationships among various biological species or other entities, based upon similarities and differences in their physical and/or genetic characteristics. In a phylogenetic tree, every node represents a species. Nodes are labeled, either with species names or with the values (also referred to as states) of their characters, and the edges represent the genetic connections. It is important to note that there is usually a big difference between the leaf nodes, which represent real species, and the internal nodes, which in most cases represent the hypothetical evolutionary ancestors of the species in the data. The taxa joined together in the tree are implied to have descended from a common ancestor. Trees are useful in fields of biology such as bioinformatics, systematics, and comparative phylogenetics.

Contents: History; Types (rooted tree, unrooted tree, bifurcating tree); Special tree types; Construction; Limitations

History
The idea of a "tree of life" arose from ancient notions of a ladder-like progression from lower to higher forms of life. Early representations of "branching" phylogenetic trees include a "paleontological chart" showing the geological relationships among plants and animals in the book Elementary Geology, by Edward Hitchcock. Charles Darwin (1859) also produced one of the first illustrations and crucially popularized the notion of an evolutionary "tree" in his seminal book The Origin of Species. Over a century later, evolutionary biologists still use tree diagrams to depict evolution because such diagrams effectively convey the concept that speciation occurs through the adaptive and random splitting of lineages. Over time, species classification has become less static and more dynamic.

Rooted tree
A rooted tree is a directed tree in which one of the nodes is designated as the root: the most ancient hypothetical common ancestor of the operational taxonomic units (OTUs) being compared. The root thus determines the direction of ancestral relationships. In a rooted phylogenetic tree, each node with descendants represents the inferred most recent common ancestor of those descendants, and the edge lengths in some trees may be interpreted as time estimates. Each node is called a taxonomic unit. Internal nodes are generally called hypothetical taxonomic units, as they cannot be directly observed. Rooting an unrooted tree involves inserting a new node, which will function as the root node. This can be done by introducing an outgroup, a species that is definitely distant from all the species of interest. The proposed root will be the direct predecessor of the outgroup.

The tips of the branches (terminal nodes, or leaves) represent the sequences being compared, sometimes called operational taxonomic units (OTUs). Leaves represent the present-day species or other taxonomic units for which we want to create a phylogeny; these are called objects (e.g., species, populations). Every object has a state vector: all objects inherit the same characters, but not necessarily the same states. The nodes connecting the branches (internal nodes) represent hypothetical common ancestors of the OTUs that the branches subtend. Edge length may represent the time from one speciation to the next. Branch lengths may have meaning in radial diagrams and phylograms: they may represent the calculated distances between nodes if distance algorithms are used, or the minimum number of steps between nodes if parsimony algorithms are used.

An unrooted tree has no predetermined root and therefore induces no hierarchy. It illustrates the relatedness of the leaf nodes without making any assumptions about ancestry. In this case, the distance between nodes should be symmetric, and the tree specifies only the interrelations of the nodes (since the tree edges are not directed). While unrooted trees can always be generated from rooted ones by simply omitting the root, a root cannot be inferred from an unrooted tree without some means of identifying ancestry; this is normally done by including an outgroup in the input data or by introducing additional assumptions about the relative rates of evolution on each branch. An outgroup is a species that unambiguously separated early from the other species being considered. Example: when comparing humans and gorillas, baboons could be used as an outgroup, and the root would be placed somewhere along the branch connecting baboons to the common ancestor of humans and gorillas. A radial diagram is particularly useful when the tree is unrooted or the root is uncertain.
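Outgroup rooting can be sketched in a few lines. The example below assumes a hypothetical representation of the unrooted tree as an adjacency dictionary (node name mapped to the set of its neighbours) and places the root on the branch leading to the outgroup leaf. The internal node name `x` and the function name `root_with_outgroup` are invented for illustration.

```python
def root_with_outgroup(adjacency, outgroup):
    """Root an unrooted tree on the branch leading to the leaf `outgroup`.

    `adjacency` maps each node name to the set of its neighbours.
    Returns the rooted tree as nested tuples; leaves are strings.
    """
    (attach,) = adjacency[outgroup]  # a leaf has exactly one neighbour

    def subtree(node, parent):
        # Every neighbour except the one we arrived from is a child.
        children = sorted(n for n in adjacency[node] if n != parent)
        if not children:
            return node
        return tuple(subtree(child, node) for child in children)

    # The new root sits on the outgroup branch: one side is the outgroup,
    # the other side is everything else.
    return (outgroup, subtree(attach, outgroup))


# Unrooted three-leaf tree with one hypothetical internal node "x".
unrooted = {
    "human": {"x"},
    "gorilla": {"x"},
    "baboon": {"x"},
    "x": {"human", "gorilla", "baboon"},
}
print(root_with_outgroup(unrooted, "baboon"))
# ('baboon', ('gorilla', 'human'))
```

Choosing a different leaf as the outgroup yields a different rooted tree from the same unrooted one, which is why the outgroup must be a taxon known to have diverged first.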

An unrooted tree represents the same phylogeny without the root node.

[Figure: rooted versus unrooted trees (Tree A, Tree B)]

Bifurcating tree
Both rooted and unrooted phylogenetic trees can be either bifurcating or multifurcating, and either labeled or unlabeled. A rooted bifurcating tree has exactly two descendants arising from each interior node (that is, it forms a binary tree), and an unrooted bifurcating tree takes the form of an unrooted binary tree, a free tree with exactly three neighbors at each internal node. In contrast, a rooted multifurcating tree may have more than two children at some nodes and an unrooted multifurcating tree may have more than three neighbors at some nodes. A labeled tree has specific values assigned to its leaves, while an unlabeled tree, sometimes called a tree shape, defines a topology only. The number of possible trees for a given number of leaf nodes depends on the specific type of tree, but there are always more multifurcating than bifurcating trees, more labeled than unlabeled trees, and more rooted than unrooted trees. The last distinction is the most biologically relevant; it arises because there are many places on an unrooted tree to put the root.

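For labeled bifurcating trees with $n$ leaves, there are $(2n-3)!! = \frac{(2n-3)!}{2^{n-2}(n-2)!}$ total rooted trees (for $n \ge 2$) and $(2n-5)!! = \frac{(2n-5)!}{2^{n-3}(n-3)!}$ total unrooted trees (for $n \ge 3$).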
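To get a feel for how quickly these numbers grow, here is a minimal sketch that evaluates the standard double-factorial counts for labeled bifurcating trees, (2n-3)!! rooted and (2n-5)!! unrooted trees on n leaves. The function names are illustrative.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1, for odd k >= -1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_trees(n):
    """Number of rooted, labeled, bifurcating trees on n >= 2 leaves."""
    if n < 2:
        raise ValueError("need at least 2 leaves")
    return double_factorial(2 * n - 3)


def unrooted_trees(n):
    """Number of unrooted, labeled, bifurcating trees on n >= 3 leaves."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return double_factorial(2 * n - 5)


for n in (3, 4, 5, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
# 3 3 1
# 4 15 3
# 5 105 15
# 10 34459425 2027025
```

With only ten leaves there are already more than 34 million rooted trees, which is why exhaustive search over tree space is rarely practical.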

The bifurcating tree
A tree that bifurcates has at most two descendants arising from each of its interior nodes.

The multifurcating tree
A tree that multifurcates may have more than two descendants arising from some of its interior nodes.

Special tree types
Dendrogram is a broad term for the diagrammatic representation of a phylogenetic tree. A cladogram is a phylogenetic tree formed using cladistic methods. This type of tree only represents a branching pattern; i.e., its branch spans do not represent time or relative amount of character change. A cladogram (slanted or rectangular) places all OTUs equidistant from the root. In taxonomy, OTUs on all branches with a common ancestor are called a clade. A taxonomic unit (species, genus, family, etc.) is said to be monophyletic if the smallest clade containing all members of that unit does not contain members of another unit. A taxonomic unit is said to be polyphyletic if the smallest clade containing all members of that unit contains members of other units. A phylogram is a phylogenetic tree that has branch spans proportional to the amount of character change. A phylogram allows variation in the distance of OTUs from the root. A chronogram is a phylogenetic tree that explicitly represents evolutionary time through its branch spans.

A monophyletic group = CLADE
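The "smallest clade" test for monophyly translates directly into code. The sketch below assumes a rooted tree stored as nested tuples with string leaves. The small amniote tree is a simplified illustration, not a result from the slides, and the function names are hypothetical.

```python
def leaves(node):
    """Set of leaf names under `node`."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def smallest_clade(node, members):
    """Leaf set of the smallest clade under `node` containing all `members`."""
    if not isinstance(node, str):
        for child in node:
            # If one child already contains every member, descend into it.
            if members <= leaves(child):
                return smallest_clade(child, members)
    return leaves(node)


def is_monophyletic(tree, members):
    """True if the smallest clade containing `members` contains nothing else."""
    members = set(members)
    return smallest_clade(tree, members) == members


# Simplified, illustrative tree.
tree = ("mammal", (("snake", "lizard"), ("crocodile", "bird")))

print(is_monophyletic(tree, {"crocodile", "bird"}))             # True
print(is_monophyletic(tree, {"snake", "lizard", "crocodile"}))  # False
```

The second group fails the test because the smallest clade containing snakes, lizards and crocodiles on this tree also contains birds.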

Construction
Distance-matrix methods, such as neighbor-joining, UPGMA (Unweighted Pair Group Method using Arithmetic Averages) and Fitch-Margoliash, calculate genetic distances from multiple sequence alignments and typically build a tree by recursively combining the two nodes with the smallest distance. They are the simplest to implement, but do not invoke an evolutionary model. Many sequence alignment programs, such as ClustalW, also create trees using these simpler, distance-based algorithms of tree construction.

Maximum parsimony is another simple method of estimating phylogenetic trees, but it implies an implicit model of evolution (i.e. parsimony): it seeks the tree with the minimum total number of character changes between nodes.

More advanced methods use the optimality criterion of maximum likelihood, often within a Bayesian framework, and apply an explicit model of evolution to phylogenetic tree estimation. These are the methods of choice nowadays. The well-known PHYLIP package implements maximum likelihood (alongside distance and parsimony methods).

Identifying the optimal tree using many of these techniques is NP-hard, so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data.
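As an illustration of the distance-matrix idea, here is a minimal UPGMA sketch: repeatedly merge the two closest clusters and set the distance from the merged cluster to every other cluster to the size-weighted average. It returns only the topology as nested tuples and omits branch lengths. The distance values are invented, and the function name `upgma` is illustrative; this is not the implementation used by any particular package.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    `names` is a list of taxon labels; `dist` maps a pair of labels
    (in either order) to their distance.
    """
    # Each current cluster is a key; the value is its number of leaves.
    size = {name: 1 for name in names}
    d = {frozenset(pair): value for pair, value in dist.items()}

    while len(size) > 1:
        # Pick the pair of clusters with the smallest distance.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        for c in size:
            if c != a and c != b:
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))]
                    + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        del size[a], size[b]

    (tree,) = size
    return tree


# Invented pairwise distances for illustration only.
names = ["human", "chimp", "gorilla", "baboon"]
dist = {
    ("human", "chimp"): 2, ("human", "gorilla"): 4, ("chimp", "gorilla"): 4,
    ("human", "baboon"): 10, ("chimp", "baboon"): 10, ("gorilla", "baboon"): 10,
}
print(upgma(names, dist))
# ('baboon', ('gorilla', ('human', 'chimp')))
```

UPGMA implicitly assumes a molecular clock, so it can recover the wrong topology when rates differ strongly between lineages; neighbor-joining relaxes that assumption.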

Terminology for character states
The following terms, coined by Hennig, are used to identify shared or distinct characters among groups:

A plesiomorphy ("close form") or ancestral state is a character state that a taxon has retained from its ancestors. When two or more taxa that are not nested within each other share a plesiomorphy, it is a symplesiomorphy (from syn-, "together") of theirs. Symplesiomorphies do not mean that the taxa that have them are necessarily closely related. For example, Reptilia is traditionally characterized by (among other things) being cold-blooded (i.e. not maintaining a constant high body temperature), whereas birds are warm-blooded. Since cold-bloodedness is a plesiomorphy, inherited from the common ancestor of traditional reptiles and birds, and thus a symplesiomorphy of turtles, snakes and crocodiles (among others), it does not mean that turtles, snakes and crocodiles form a clade that excludes the birds.

An apomorphy ("separate form") or derived state is an innovation. It can thus be used to diagnose a clade or even to define a clade name in phylogenetic nomenclature. One clade may have autapomorphies (from auto-, "self"), two sister-groups may have synapomorphies (from syn-, "together"). For example, the possession of digits that are homologous with those of Homo sapiens is an apomorphy within the vertebrates. The tetrapods can be singled out as consisting of the first vertebrate with such digits together with all descendants of this vertebrate (an apomorphy-based phylogenetic definition). [19] Importantly, snakes and other tetrapods that do not have digits are nonetheless tetrapods: they descend from ancestors that possessed digits which were homologous with ours.

A character state is homoplastic or "a homoplasy" if it is shared by two or more organisms but was not present in their common ancestor. It has evolved by convergence or reversion. Both mammals and birds are able to maintain a high constant body temperature (i.e. they are 'warm-blooded'). However, the ancestors of each group did not share this character, so it must have evolved independently. Warm-bloodedness is separately an apomorphy of mammals (or a larger clade) and one of birds (or a larger clade), but it is not a synapomorphy of these two clades.

The terms plesiomorphy and apomorphy are relative; their application depends on the position of a group within a tree. An (aut)apomorphy of one clade is a plesiomorphy of each of its members. For example, when trying to decide whether the tetrapods form a clade, an important question is whether having four limbs is a synapomorphy of all the taxa to be included within Tetrapoda: did all the possible members of the Tetrapoda inherit four limbs from a common ancestor, whereas all other vertebrates did not? By contrast, for a group within the tetrapods, such as birds, having four limbs is a plesiomorphy. Using these two terms allows a greater precision in the discussion of homology, in particular allowing clear expression of the hierarchical relationships among different homologies. It can be difficult to decide whether a character is in fact the same and thus can be classified as a synapomorphy which may identify a monophyletic group or whether it only appears to be the same and is thus a homoplasy which cannot identify such a group. There is a danger of circular reasoning: assumptions about the shape of a phylogenetic tree are used to justify decisions about characters, which are then used as evidence for the shape of the tree. [20] Phylogenetics uses various forms of parsimony to decide such questions; but the solutions often depend on the dataset and the methods.
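One common form of parsimony reasoning counts the minimum number of state changes a character needs on a given tree. The sketch below assumes Fitch's algorithm for a single unordered character on a rooted binary tree (nested pairs with string leaves). The tree and the character codings are simplified illustrations, not data from the slides.

```python
def fitch(node, states):
    """Return (candidate_states, min_changes) for the subtree at `node`.

    Leaves are strings looked up in `states`; internal nodes are pairs.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is required on this pair of branches.
    return left_set | right_set, left_cost + right_cost + 1


# Simplified, illustrative tree and character codings.
tree = ("mammal", (("snake", "lizard"), ("crocodile", "bird")))
body_temperature = {
    "mammal": "warm", "snake": "cold", "lizard": "cold",
    "crocodile": "cold", "bird": "warm",
}
# Hypothetical character shared only by crocodiles and birds.
shared_trait = {
    "mammal": "absent", "snake": "absent", "lizard": "absent",
    "crocodile": "present", "bird": "present",
}

print(fitch(tree, body_temperature)[1])  # 2
print(fitch(tree, shared_trait)[1])      # 1
```

On this tree the warm-blooded state needs at least two changes, so it cannot be explained by a single origin and behaves as a homoplasy, whereas the second character fits a single change and is a candidate synapomorphy of crocodiles and birds.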

The Importance of Phylogenetic Trees
1. Increasing use of phylogenetic trees in the biological sciences
2. Need to know what tree diagrams do and do not communicate
3. Provide an efficient structure for organizing biodiversity information
4. Develop an accurate conception of the totality of evolutionary history
5. Important for aspiring biologists to develop this understanding

Limitations
Phylogenetic trees do not necessarily represent the species' evolutionary history accurately. The data on which they are based are noisy; the analysis can be confounded by horizontal gene transfer, hybridisation between species that were not nearest neighbors on the tree before the hybridisation took place, convergent evolution, and conserved sequences.

Also, there are problems in basing the analysis on a single type of character, such as a single gene or protein, or only on morphological analysis, because trees constructed from another, unrelated data source often differ from the first; great care is therefore needed in inferring phylogenetic relationships among species. This is most true of genetic material that is subject to lateral gene transfer and recombination, where different haplotype blocks can have different histories. In general, the output tree of a phylogenetic analysis is an estimate of the character's phylogeny (i.e. a gene tree) and not the phylogeny of the taxa (i.e. species tree) from which these characters were sampled, though ideally the two should be very close. For this reason, serious phylogenetic studies generally use a combination of genes that come from different genomic sources (e.g., from mitochondrial or plastid vs. nuclear genomes), or genes that would be expected to evolve under different selective regimes, so that homoplasy (false homology) would be unlikely to result from natural selection.

When extinct species are included in a tree, they are terminal nodes, as it is unlikely that they are direct ancestors of any extant species.