Principles of Phylogeny Reconstruction: How do we reconstruct the tree of life? Basic Terminology. Looking at Trees.

Transcription:

Principles of Phylogeny Reconstruction
How do we reconstruct the tree of life?
Wayne Maddison, Sean Graham

Outline:
- Terminology
- Methods: parsimony, maximum likelihood, bootstrapping
- Problems: homoplasy, hybridization

Phylogeny: Basic Terminology
Phylogenetic tree:
- Tips represent taxa (usually extant).
- Nodes represent hypothesized common ancestors.
- The root is the oldest common ancestor on a rooted tree.
- Branches represent time or amount of change between nodes, or between nodes and tips (but length is often arbitrary).

Basic Terminology
Rooted trees typically have one or more outgroups. An outgroup represents a group that diverged before the diversification of the group of interest. Outgroups tell us about the direction of change within the ingroup (the ingroup is the group under study).

Looking at Trees
Rooted trees have a root, and nodes closer to the root represent older divergences than nodes near the tips. The groups on either side of a node (sister taxa) are considered to be of equal age.

These two trees show the same relationships, but the unrooted tree makes no claims about which of the divergences is oldest. An unrooted tree could potentially be rooted at any of its nodes. Can you draw a rooted tree using one of the roots within the red group?

Note: there is more than one way to depict a set of relationships. Be careful not to overinterpret the orientation of the branches.

Looking at Trees
Do these phylogenies show the same relationships?
[Figure: two trees. Tip order in the first: Amphibians, Birds, Crocodiles, Snakes, Lizards, Turtles, Mammals. Tip order in the second: Amphibians, Mammals, Turtles, Lizards, Snakes, Crocodiles, Birds.]

Interpreting Branch Lengths
The branch lengths on phylogenetic trees may or may not be proportional to the amount of change along their length.
- Cladogram
- Phylogram: if branch lengths are proportional to change, the tips will not be neatly lined up, and a scale should be included.
[Figure: cladogram and phylogram of the same taxa; scale bar = 1 nucleotide change.]

Interpreting Groupings
[Figure 14.1]
These terms are used to compare named entities (e.g. fishes, mammals, etc.) to groupings found in phylogenetic trees:
- Monophyletic group, or clade
- Paraphyletic group
- Polyphyletic group

What is the relationship between taxonomic names and phylogenetic groups?
[Figure labels: Amniotes; Reptiles; Birds, Crocodiles, Snakes, Lizards, Turtles, Mammals, Amphibians; Amnion.]
[Figure labels: Birds, Crocodiles, Snakes, Lizards, Turtles; Cold Blooded.]
[Figure labels: Turtles, Lizards, Snakes, Crocodiles, Birds, Bats, Rodents, Amphibians; Wings.]

An example of a polyphyletic group: Amentiferae
[Figure labels: Alder, Walnut, Willow.]
All of these trees have highly reduced male flowers clustered into structures called catkins. These specialized structures were previously thought to reflect close relationships among the trees that have them. Therefore, the families of trees with catkins were grouped into the Amentiferae. However, it turns out that catkins are adaptations to wind pollination that reflect common selection, not common history.

An example of a polyphyletic group: Amentiferae
[Figure labels: Willows, Walnuts, Oaks; Evolution of catkins; Ancestor with separate flowers.]

What is the relationship between taxonomic names and phylogenetic groups?
Are these groups monophyletic, paraphyletic or polyphyletic?
- fish?
- tetrapods (= four limbed)?
- amphibians?
- mammals?
- ectotherms (= cold blooded)?
[Figure: Vertebrate Phylogeny.]

Reconstructing Evolutionary Trees
The development of methods:
I. distance methods (UPGMA, neighbor joining)
II. parsimony methods
III. maximum likelihood
(IV.) Bayesian inference

I. Distance Methods (phenetics)
Distance methods grew out of the school of numerical taxonomy, which had its heyday in the 1960s. Taxonomists were looking for more rigorous methods of developing classifications and inferring relationships. The idea was to use total information, measuring many characters and producing a summary of what the characters suggest about groupings based on overall similarity. These approaches were also practical when molecular datasets started to get very large and, for a time, outpaced computer processing power.

I. Distance Methods (phenetics)
Example 1: morphology
[Figure: taxa plotted by Trait 1 and Trait 2, with the overall distance matrix.]

Distance methods with sequence data
[Figure: sequences and their distance matrix.]

Distance methods with sequence data
[Figure: tree assembled step by step from the distance matrix.]
New distance matrix: take averages.

Strengths and weaknesses of distance methods
Advantages
- Intuitive, easy to understand
- Works with all sorts of data, alone or in combination
- Fast implementation on large data sets
- Can handle very large data sets easily
Disadvantages
- Must assume that similarity reflects shared evolutionary history (when is this most problematic?)
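
The "take averages" step of a distance method can be sketched in code. What follows is a minimal UPGMA-style sketch, not the worked example from the slides: the function name `upgma`, the taxon labels A to D and every distance are hypothetical, chosen only to show the closest pair being joined and its distances to the remaining taxa being averaged.

```python
from itertools import combinations

def upgma(names, dist):
    """Join the closest pair of clusters repeatedly, averaging distances.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns nested tuples that record the order of joining.
    """
    clusters = {name: 1 for name in names}   # cluster -> number of tips inside it
    d = dict(dist)                           # work on a copy of the matrix
    while len(clusters) > 1:
        # Find the two clusters separated by the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        # New distance matrix: average the two old rows, weighted by tip count.
        for c in clusters:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))

# Hypothetical pairwise distances (assumed for illustration only).
pairs = {("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
         ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(p): v for p, v in pairs.items()}
print(upgma(["A", "B", "C", "D"], dist))
```

With these made-up distances, A and B are joined first, then C is added, then D. The sketch groups by overall similarity only, which is exactly the assumption listed under Disadvantages.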

II. Parsimony Methods (Cladistics)
- Methods originally developed by Willi Hennig (German entomologist), presented in a book published in 1966
- Translated into English in 196; very influential
- Originally important in analysis of small morphological data sets, including those from fossils
- These methods came to the forefront with the application of DNA sequencing technology to systematics (early 1990s).
- In the early days, the methods were tough to implement because of limitations in computer processor speed (still somewhat limiting at times, because data sets keep getting larger).

Applying parsimony
Consider four taxa (1-4) and four characters (A-D). Ancestral state: abcd.
[Table: states of traits A-D in taxa 1-4.]

Applying parsimony
[Figure: first tree for taxa 1-4, with unique changes and convergences or reversals marked on the branches from the ancestral state abcd: 5 steps.]
[Figure: alternative tree for the same taxa and characters: 4 steps.]
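
Counting steps on competing trees can be sketched with Fitch's small-parsimony count. This is an illustrative sketch under stated assumptions: the character matrix is hypothetical (lower case for the ancestral state, upper case for a derived state) and is not the matrix from the slides, and `fitch_steps` and `tree_length` are names invented here.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, steps) for one character.

    tree:   a taxon name (tip) or a pair (left, right) of subtrees.
    states: dict mapping taxon name to its observed state.
    """
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # The two sides can share a state here: no change is needed.
        return common, left_steps + right_steps
    # The two sides disagree: count one change at this node.
    return left_set | right_set, left_steps + right_steps + 1

def tree_length(tree, matrix):
    """Sum the Fitch step counts over every character in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: seq[i] for taxon, seq in matrix.items()})[1]
        for i in range(n_chars)
    )

# Hypothetical data: four taxa, four characters (assumed for illustration).
matrix = {"1": "abcd", "2": "abCd", "3": "ABcd", "4": "ABCD"}
tree_x = (("1", "2"), ("3", "4"))
tree_y = (("1", "3"), ("2", "4"))
print(tree_length(tree_x, matrix), tree_length(tree_y, matrix))
```

For this made-up matrix the first tree needs 5 steps and the second needs 6, so parsimony prefers the first. The count treats states as unordered and every change as equally costly, which is an assumption of this sketch.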

Strengths and weaknesses of parsimony
Strengths
- Straightforward to calculate the length of the tree (number of steps)
- Simulation studies have shown that parsimony algorithms are reliable under a range of conditions
- Conceptually simple; satisfying
Weaknesses
- Cannot easily accommodate complex models of evolutionary change (e.g. in which rates of evolutionary change differ among branches)
- Under certain circumstances, can be positively misleading

Parsimony practice
[Table: characters 1-6 scored for taxa K, L, M and N.]
Which unrooted tree is most parsimonious?
[Figure: alternative unrooted trees for K, L, M and N.]
Plot each change on each tree. Positions 1 and 2 are done. Which positions help to determine relationships?

Inferring the direction of evolution
Where did the mutation occur, and what was the change?
[Figure: tree of Mouse (outgroup), Orangutan, Gorilla, Human, Bonobo, Chimp.]

III. Maximum Likelihood Methods (and Bayesian analysis as currently used)
Maximum likelihood approaches involve using a specific model to determine the probability that a particular base substitution will occur along a particular branch on a tree. In effect, the question being addressed is: what is the probability of the observed data, given a particular tree and a particular model of substitution?

The best tree is the one with the highest probability of explaining the observed data, given the model.

Maximum likelihood: a simple model
[Figure: transitions and transversions among the four bases.]
Probabilities:
- transition: 0.2
- transversion: 0.1
- no change: 0.7
TASK: Find the tree with the highest probability.
Probability = P1 x P2 x P3
P1 = (.7)(.1)(.2)(.7)(.7)
P2 = (.7)(.1)(.7)(.7)(.7)
P3 = (.1)(.2)(.7)(.7)(.2)

More complex likelihood models
Likelihood models can be quite complex, and different models assign different probabilities to changes, including:
- Relative probabilities of transitions and transversions
- Variation in mutation rates across sites (e.g. by codon position in protein-coding genes) or regions (intron versus exon versus spacers)
- Variation in mutation rates across lineages
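
The arithmetic of the simple model can be sketched as a product of per-branch probabilities. This sketch takes the "no change" probability to be 0.7, the value that makes the three probabilities sum to 1; treat that, the event lists for P1 to P3, and the names `MODEL` and `site_probability` as assumptions made for illustration.

```python
import math

# Assumed model: the three probabilities sum to 1.
MODEL = {"transition": 0.2, "transversion": 0.1, "none": 0.7}

def site_probability(events):
    """Multiply the probabilities of the events on each branch for one site."""
    return math.prod(MODEL[event] for event in events)

# One list of branch events per site, read off the three products P1, P2, P3.
sites = [
    ["none", "transversion", "transition", "none", "none"],        # P1
    ["none", "transversion", "none", "none", "none"],              # P2
    ["transversion", "transition", "none", "none", "transition"],  # P3
]

site_probs = [site_probability(events) for events in sites]
tree_probability = math.prod(site_probs)
# Sum of logs, which is what is usually maximized in practice.
log_likelihood = sum(math.log(p) for p in site_probs)
print(site_probs, tree_probability, log_likelihood)
```

Repeating the calculation for each candidate tree and keeping the tree with the largest product is the task set on the slide. Working with the sum of logarithms avoids very small numbers when there are many sites.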

Assessment of Maximum Likelihood (also Bayesian)
Strengths
- Highly flexible (any model can be used)
- Statistically justifiable: given enough data (and the right model), will always infer the correct tree (as shown by simulation studies)
Weaknesses
- Impossible to know that the model is correct, and different models may yield different answers
- Computationally intensive (most data sets not fully analyzable)

Characters to use in phylogeny
- Morphology
- DNA sequence

Characters to use in phylogeny
What are the desirable qualities of characters used for phylogeny reconstruction?
1.   2.   3.   4.
How are these qualities met by DNA sequence data?

The problem of homology with DNA: the good, the bad and the ugly
Alignment (= HOMOLOGY assessment) can be very challenging!
[Figure: example alignments of Taxon 1 and Taxon 2.]

The problems of locus choice: Getting the right rate of evolution
Too slow? Not enough variation.
[Figure labels: Taxon 1, Taxon 2, Taxon 3.]

Example of insufficient evidence: metazoan phylogeny
[Figure labels: Metazoans, Fungi, Polytomy.]

Challenges: sunflower phylogeny
- Recent radiation (200,000 years)
- Many species, much hybridization
- Need more rapidly evolving markers!
[Figure legend: = 15 spp., = 12 spp.]

The problems of locus choice: Getting the right rate of evolution
Too fast?
- Homoplasy likely
- Saturation: only 4 possible states for DNA
[Figure labels: Taxon 1, Taxon 2, Taxon 3, Polytomy.]

Saturation: mammalian mitochondrial DNA
[Figure]

Saturation
Imagine changing one nucleotide every hour to a random nucleotide. Split the ancestral population in 2.
[Figure: the sequences after one hour, four hours, 8 hours, 12 hours, and 24 hours? Red indicates multiple mutations at a site.]
This line is what we would expect if we had an infinite number of bases, so that every mutation could be seen.

Phylogeny case study I: whales
Are whales ungulates (hoofed mammals)?
[Figure 4.8]

Forces of evolution and phylogeny reconstruction
How does each force affect the ability to reconstruct phylogeny?
- mutation?
- drift?
- selection?
- non-random mating?
- migration?
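
The saturation thought experiment above (one random change per hour in each of two lineages split from one ancestor) can be simulated in a few lines. This is a sketch under stated assumptions: the sequence length of 20, the seed and the name `simulate_saturation` are hypothetical, and a "random nucleotide" is allowed to equal the base it replaces.

```python
import random

def simulate_saturation(length=20, hours=24, seed=0):
    """Track actual versus observed change between two diverging sequences.

    Returns a list of (hour, mutations so far, observed differences).
    """
    rng = random.Random(seed)
    ancestor = [rng.choice("ACGT") for _ in range(length)]
    left, right = ancestor[:], ancestor[:]      # split the ancestral population in 2
    history = []
    for hour in range(1, hours + 1):
        for seq in (left, right):
            site = rng.randrange(length)
            seq[site] = rng.choice("ACGT")      # may hit a site that already changed
        observed = sum(a != b for a, b in zip(left, right))
        history.append((hour, 2 * hour, observed))
    return history

for hour, mutations, observed in simulate_saturation():
    if hour in (1, 4, 8, 12, 24):
        print(hour, mutations, observed)
```

The number of mutations grows by two every hour, while the observed differences can never exceed the sequence length and fall behind as sites are hit more than once. With an infinite number of bases the two columns would stay equal, which is the straight line described on the slide.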

Whales: DNA sequence data
[Figure: tree from Hillis 1999.]
How reliable is this tree? Bootstrapping.

How consistent are the data?
- Take the dataset (5 taxa, 10 characters).
- Create a new data set by sampling characters at random, with replacement.
[Table: the original data set (taxa Human, Chimp, Bonobo, Gorilla, Orang; characters 1-10) beside a resampled data set in which some characters appear more than once and others are absent.]

Whales: DNA sequence data
[Figure: tree from Hillis 1999.]

Molecular clocks
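
The resampling step of bootstrapping can be sketched as follows. The alignment is hypothetical (5 taxa, 10 characters, matching only the dimensions on the slide), and the names `bootstrap_columns` and `resample` are invented for this illustration.

```python
import random

def bootstrap_columns(n_chars, rng):
    """Pick n_chars character indices at random, with replacement."""
    return [rng.randrange(n_chars) for _ in range(n_chars)]

def resample(matrix, columns):
    """Build a new data set from the chosen columns, same taxa and same size."""
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in matrix.items()}

# Hypothetical alignment, assumed for illustration only.
matrix = {
    "Human":   "ACGTACGTAC",
    "Chimp":   "ACGTACGTAT",
    "Bonobo":  "ACGTACGAAT",
    "Gorilla": "ACGAACGAAT",
    "Orang":   "TCGAACTAAT",
}

rng = random.Random(42)
columns = bootstrap_columns(10, rng)
replicate = resample(matrix, columns)
print(columns)
print(replicate)
```

A full bootstrap analysis would repeat this many times, build a tree from each replicate, and report how often each grouping is recovered. The tree-building step is left out of this sketch.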

Basic idea of molecular clocks
[Figure: chimps and humans, 6 substitutions; whales and hippos, 60 substitutions, 56 mya.]

Challenges for phylogeny: gene flow
[Figure: sunflower annuals.]
Different genes may have different histories!

Wayne Maddison (UBC) has emphasized that genes and species are not expected to always have the same evolutionary history. As such, gene trees and species trees will not always match each other, as shown in this diagram from the computer package MESQUITE (Maddison and Maddison), developed to tackle some of these complexities.
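
The basic idea of a molecular clock, from the slide earlier in this section, can be sketched as a calibration followed by an estimate. The figures are used exactly as they appear in this transcript (6 and 60 substitutions, 56 mya), which may be incomplete, and the sketch assumes a single constant rate shared by both pairs; `clock_rate` and `estimate_divergence` are hypothetical names.

```python
def clock_rate(substitutions, divergence_mya):
    """Calibrate the clock: substitutions accumulated per million years."""
    return substitutions / divergence_mya

def estimate_divergence(substitutions, rate):
    """Estimate a divergence time (mya) from a substitution count and a rate."""
    return substitutions / rate

# Calibration pair: whales and hippos, 60 substitutions over 56 million years.
rate = clock_rate(60, 56)
# Unknown pair: chimps and humans, 6 substitutions.
chimp_human_mya = estimate_divergence(6, rate)
print(round(chimp_human_mya, 1))
```

Under these assumptions the chimp and human split comes out at roughly 5.6 million years ago, one tenth of the calibration date because the pair shows one tenth as many substitutions.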

Phylogeny study questions
1) Explain in words the difference between monophyletic, paraphyletic, and polyphyletic groups. Draw a hypothetical phylogeny representing each type. Give an actual example of a commonly recognized paraphyletic taxon in both animals and plants (use your text for sources).
2) How can a phylogenetic tree be used to determine if a similar character in two taxa is due to homoplasy?
3) Whales are classified as cetaceans, not artiodactyl ungulates. This makes artiodactyls paraphyletic. Why? What is the evidence that whales belong in the artiodactyls?
4) Phenetics (distance methods) and cladistics (parsimony) differ in the ways they recognize and use similarities among taxa to form phylogenetic groupings. What types of similarity does each school recognize, and how useful is each type of similarity considered to be for identifying groups?
5) What is bootstrapping in the context of phylogenetic analysis, and why is this procedure performed?
6) Why are maximum likelihood methods increasing in popularity for reconstructing phylogenies? In your answer, include a short description of how this method identifies the best phylogeny.
7) Integrative question: Draw a pair of axes with "Time since divergence" on the x axis and "percent of sites that are the same" on the y axis. Draw a line that shows the expected pattern for third codon sites in protein-coding genes. Is your graph linear? Explain why or why not. How and why would the graph of first codon positions differ from this?
8) You are studying a group of species that lives in two very different environments. You build two phylogenies: one is based on a locus that is probably under divergent selection in the two environments, while the other is based on a neutral locus. Which phylogeny would be more likely to represent the species history? Why?
9) It has been known for a number of years that Anolis lizards found in similar microhabitats on many separate islands in the Caribbean are very similar to each other (for example, large lizards that feed on the ground, smaller lizards that feed on tree trunks, and very small lizards that feed at the tops of branches). Two different historical explanations have been proposed to explain this pattern: each morph has evolved repeatedly on each island, or each morph has evolved just once, then dispersed. Sketch a phylogeny that would support each hypothesis.
10) Integrative question: the Cameroon lake cichlid phylogeny, showing that the lake species were monophyletic, was based on mitochondrial DNA. Explain why this might not reflect the species history. How could you be more certain about the phylogeny?