Inferring Molecular Phylogeny

Dr. Walter Salzburger
The tree of life, Gustav Klimt (1907)

1. Molecular Markers

Immunological comparisons
- Nuttall & Uhlenhuth (early 20th century): blood relationships between species
- cross-reaction between sera and antisera
- strategy: the degree of similarity reflects the strength of the evolutionary relationship
[Figure from Avise (1994)]

Protein electrophoresis
- developed by Hunter & Markert (1957)
- non-denatured proteins with different net charges migrate at different rates through starch or acrylamide gels
- histochemical stains specific for the enzymes under assay
- zymograms are interpretable in terms of Mendelian genotypes

[Figure from Avise (1994)]

Restriction endonucleases
- Linn & Arber*, Meselson & Yuan (1968): discovery of restriction endonucleases (enzymes), i.e., precise scalpels that cut double-stranded DNA at specific motifs
*Nobel Prize in Physiology or Medicine 1978

Recombinant DNA technology

Restriction digestion
- restriction digestion profiles
- phylogenetic and population genetic markers
- presence/absence matrix
- used for: mitochondria, plastids, whole genomes, after PCR amplification, etc.

Restriction fragment length polymorphism

Amplified fragment length polymorphism

DNA-DNA hybridization
- ...relies on the double-stranded nature of DNA
- ...and on the fact that complementary strands are held together by hydrogen bonds
- when DNA is heated, it melts into single strands; when it is cooled, it re-associates
- data: thermal elution profiles
[Figure from Avise (1994)]

Dissociation curves
- homoduplex hybridization (intraspecific)
- heteroduplex hybridization (interspecific)
- genetic distance ~ ΔT_homo - ΔT_hetero
[Figure: dissociation curves for the homoduplex (ostrich) and the heteroduplex (ostrich-rhea); Avise (1994)]

[Figures: Sibley & Ahlquist (1980s); Sibley & Ahlquist (1990)]

DNA Sequencing
- Walter Gilbert and Fred Sanger develop techniques for DNA sequencing (1977)
- Walter Gilbert (1932-) and Fred Sanger (1918-): Nobel Prize in Chemistry 1980

Polymerase Chain Reaction (PCR)
- In the 1980s, Kary B. Mullis invents the PCR and helps to develop it further
- Kary B. Mullis (1944-): Nobel Prize in Chemistry 1993

2. Inferring Phylogenies

Documents of Evolutionary History
[Figure: DNA sequences arranged along two axes, space and time]

Documents of Evolutionary History
[Figure: DNA sequences changing along the time axis]

[Figure: raw DNA sequences are turned by alignment* into aligned DNA sequences, some of which contain a gap]
*Alignment: inferring homology at the DNA sequence level

[Figure: aligned DNA sequences are turned by phylogeny reconstruction into a molecular phylogeny]

2.1. Sequence Alignment

Homology
- ...similarity between characters due to shared ancestry
- Two nucleotides in different sequences are homologous if (and only if) the sequences both acquired that state from their common ancestor
[Figure: a homologous character compared with a homoplasious character]

Alignment
- ...set of homologous sequences in which every nucleotide position is homologous
- to align: inferring homology at the sequence level
[Figure: four sequences before and after alignment; a gap is introduced in two of them]

Pairwise alignment I: gaps
[Figure: dot plot of Sequence1 against Sequence2, with two alternative alignments: (1) without a gap and (2) with a gap]

Pairwise alignment II: sequences that differ
[Figure: dot plot of Sequence1 against Sequence2 and the resulting alignment (1)]

Pairwise alignment III: the cost of an alignment
cost = s + wg
- s: number of substitutions
- g: total length of gaps
- w: gap penalty
- w = 1: a gap is as expensive as a substitution
- w = 2: a gap is twice as expensive as a substitution

Pairwise alignment IV: evaluating alternatives
[Figure: dot plot of Sequence1 against Sequence2, with two alternative alignments (1) and (2)]
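The cost rule above is simple enough to sketch in a few lines of Python. This is only an illustration: the function and variable names are made up here, and the substitution and gap counts (s = 2, g = 1 for the first alignment; s = 0, g = 2 for the second) are the ones used in this section's comparison of two alternative alignments.

```python
def alignment_cost(substitutions, gap_length, gap_penalty):
    """Cost of a pairwise alignment: cost = s + w * g."""
    return substitutions + gap_penalty * gap_length


# (s, g) for two alternative alignments of the same pair of sequences.
alternatives = {
    "alignment 1": (2, 1),  # two substitutions, total gap length 1
    "alignment 2": (0, 2),  # no substitutions, total gap length 2
}


def cheapest_alignment(gap_penalty):
    """Name of the alternative with the lowest cost for a given gap penalty w."""
    return min(
        alternatives,
        key=lambda name: alignment_cost(*alternatives[name], gap_penalty),
    )


for w in (1, 3):
    costs = {name: alignment_cost(s, g, w) for name, (s, g) in alternatives.items()}
    print(f"w = {w}: {costs}, cheapest: {cheapest_alignment(w)}")
```

With a cheap gap (w = 1) the gapped alignment wins; with an expensive gap (w = 3) the alignment with substitutions is preferred. The choice of w therefore decides which alignment is considered best.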

Pairwise alignment IV: evaluating alternatives
cost = s + wg
- alignment 1: cost (w = 1) = 2 + 1 x 1 = 3; cost (w = 3) = 2 + 3 x 1 = 5
- alignment 2: cost (w = 1) = 0 + 1 x 2 = 2; cost (w = 3) = 0 + 3 x 2 = 6

BLAST: Basic Local Alignment Search Tool
- http://www.ncbi.nlm.nih.gov/BLAST/
- ...finds regions of local similarity between nucleotide or protein sequences and calculates the statistical significance of matches
- ...uses databases in GenBank*
- ...see official NCBI handbook, chapter 16 (on course web page)
*nucleotide and protein database of the National Center for Biotechnology Information

Multiple alignment
- sum-of-pairs: minimizing the costs of all pairwise alignments (e.g., computer program ClustalW)
- tree alignment: uses phylogenetic information
- star alignment: all sequences are equally related
- tree alignment: phylogenetic relationships between sequences are taken into account

Protein alignments: PAM and BLOSUM matrices
- PAM: Point Accepted Mutation
- BLOSUM: BLOcks SUbstitution Matrix

BLOSUM62 (lower triangle)

     C   S   T   P   A   G   N   D   E   Q   H   R   K   M   I   L   V   F   Y   W
C    9
S   -1   4
T   -1   1   5
P   -3  -1  -1   7
A    0   1   0  -1   4
G   -3   0  -2  -2   0   6
N   -3   1   0  -2  -2   0   6
D   -3   0  -1  -1  -2  -1   1   6
E   -4   0  -1  -1  -1  -2   0   2   5
Q   -3   0  -1  -1  -1  -2   0   0   2   5
H   -3  -1  -2  -2  -2  -2   1  -1   0   0   8
R   -3  -1  -2  -2  -1  -2   0  -2   0   1   0   5
K   -3   0  -1  -1  -1  -2   0  -1   1   1  -1   2   5
M   -1  -1  -2  -2  -1  -3  -2  -3  -2   0  -2  -1  -1   5
I   -1  -2  -3  -3  -1  -4  -3  -3  -3  -3  -3  -3  -3   1   4
L   -1  -2  -3  -3  -1  -4  -3  -4  -3  -2  -3  -2  -2   2   2   4
V   -1  -2  -2  -2   0  -3  -3  -3  -2  -2  -3  -3  -2   1   3   1   4
F   -2  -2  -4  -4  -2  -3  -3  -3  -3  -3  -1  -3  -3   0   0   0  -1   6
Y   -2  -2  -3  -3  -2  -3  -2  -3  -2  -1   2  -2  -2  -1  -1  -2  -1   3   7
W   -2  -3  -4  -4  -3  -2  -4  -4  -3  -2  -2  -3  -3  -1  -2  -1  -3   1   2  11

Row groups labelled in the figure: Cysteine, Hydrophilic, Acid-amide, Basic, Hydrophobic, Aromatic.

2.2. Phylogenetic Methods
- Distance Methods
  - UPGMA
  - Neighbor joining
  - Minimum Evolution
- Maximum Parsimony
- Maximum Likelihood
  - ML
  - Bayesian Inference

[Figure: nucleotide sequences of four taxa (sites 1 to 7) and the genetic distance matrix derived from them]

genetic distance
        Seq A   Seq B   Seq C
Seq B     3       -       -
Seq C     5       4       -
Seq D     5       4       2

Discrete character vs. distance method
[Figure: the same four sequences analysed in two ways: a distance tree with branch lengths 2, 1, 2, 1, 1 derived from the genetic distance matrix, and a character-based tree on which the changes at sites 1 to 7 are mapped]
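How a genetic distance matrix is derived from aligned sequences can be sketched as follows. The four 7-site sequences below are hypothetical (the sequences on the slide are not reproduced here); they were simply chosen so that the number of differing sites reproduces the distances 3, 5, 4, 5, 4, 2 shown above. The labels Seq A to Seq D are assumptions as well.

```python
from itertools import combinations

# Hypothetical aligned sequences, chosen to reproduce the distances above.
alignment = {
    "Seq A": "TTAAAAA",
    "Seq B": "AATAAAA",
    "Seq C": "AAATTTA",
    "Seq D": "AAATTAT",
}


def count_differences(x, y):
    """Number of aligned sites at which two sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(x, y))


# Pairwise distance for every pair of sequences.
distances = {
    (p, q): count_differences(alignment[p], alignment[q])
    for p, q in combinations(alignment, 2)
}

for (p, q), d in distances.items():
    print(p, q, d)
```

A distance method continues from the `distances` table alone, whereas a discrete character method keeps working with the individual sites of `alignment`.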

Cluster methods: step-by-step approach
[Figure: beginning with a starting tree, each round adds the next sequence and places it on the tree (Round 1, Round 2)]

Optimality criterion: choose among all possible trees
[Figure: a set of alternative trees]

Optimality criterion: too many trees problem...

number of taxa   number of trees (unrooted)   number of trees (rooted)
 2                       1                            1
 3                       1                            3
 4                       3                           15
 5                      15                          105
 6                     105                          945
 7                     945                        10395
 8                   10395                       135135
 9                  135135                      2027025
10                 2027025                     34459425

Type of data    Tree building method
                Clustering algorithm        Optimality criterion
Distances       UPGMA, Neighbor joining     Minimum Evolution
Nucleotides                                 Maximum Parsimony, Maximum Likelihood
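The tree counts in the table follow the standard closed forms for strictly bifurcating trees: (2n-5)!! unrooted and (2n-3)!! rooted trees for n taxa, where !! denotes the product of odd integers. The formulas are not stated on the slide; the sketch below, with made-up function names, only checks that they reproduce the tabulated values.

```python
def odd_double_factorial(m):
    """Product 1 * 3 * 5 * ... * m for odd m; returns 1 when m < 3."""
    result = 1
    for k in range(3, m + 1, 2):
        result *= k
    return result


def unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees: (2n - 5)!!"""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return odd_double_factorial(2 * n_taxa - 5)


def rooted_trees(n_taxa):
    """Number of rooted bifurcating trees: (2n - 3)!!"""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return odd_double_factorial(2 * n_taxa - 3)


for n in range(2, 11):
    print(n, unrooted_trees(n), rooted_trees(n))
```

The number of rooted trees for n taxa equals the number of unrooted trees for n + 1 taxa, which is why each value in the table reappears one row further down in the neighbouring column.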

UPGMA: unweighted pair group method with arithmetic mean

distance matrix
     A    B    C    D
A    -
B    2    -
C    6    6    -
D   10   10   10    -

[Figure: example data, two sequences and a character matrix for two taxa (pigment 1/0, fins 1/1, eyes 1/1, teeth 0/1)]

UPGMA: unweighted pair group method with arithmetic mean
[Figure: the distance matrix and the resulting ultrametric tree, drawn against a scale from 6 to 0]
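A minimal sketch of UPGMA clustering on the four-taxon distance matrix above (distances 2, 6, 6, 10, 10, 10). The taxon labels A to D and all function names are assumptions made for this illustration. The distance between two clusters is taken as the arithmetic mean of all leaf-to-leaf distances, and each new node is placed at half that distance.

```python
from itertools import combinations

# Leaf-to-leaf distances; taxon labels A-D are assumed for illustration.
leaf_distances = {
    frozenset("AB"): 2,
    frozenset("AC"): 6,
    frozenset("BC"): 6,
    frozenset("AD"): 10,
    frozenset("BD"): 10,
    frozenset("CD"): 10,
}


def upgma(leaf_distances):
    """Return the merges as (members of the new cluster, node height)."""
    leaves = sorted({name for pair in leaf_distances for name in pair})
    clusters = [(leaf,) for leaf in leaves]
    merges = []

    def average_distance(c1, c2):
        # Arithmetic mean over all pairs of leaves, one from each cluster.
        total = sum(leaf_distances[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Join the two clusters that are closest on average.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(*pair),
        )
        height = average_distance(c1, c2) / 2
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1 + c2, height))
    return merges


merges = upgma(leaf_distances)
for members, height in merges:
    print(sorted(members), height)
```

Because every leaf ends up at the same distance from the root, the result is an ultrametric tree; this is only appropriate when the sequences evolve at roughly equal rates.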

Neighbor joining (NJ)

distance matrix
     A    B    C    D
A    -
B    6    -
C    7    3    -
D   14   10    9    -

[Figure: the distance matrix and the resulting additive tree with its branch lengths]

Minimum evolution (ME)
tree length: $L = \sum_{i=1}^{2n-3} e_i$
- $2n-3$: total number of branches in a tree of $n$ sequences
- $e_i$: individual branch length
- The minimum evolution tree is the one that minimizes $L$
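The neighbor-joining example can be sketched with the standard NJ recipe (choose the pair minimising Q(i,j) = (n-2) d(i,j) - r(i) - r(j), then replace the pair by a new internal node). The slide does not spell out these steps, so treat this as one common formulation rather than the lecture's own; the taxon labels A to D, the internal node names and the function name are assumptions.

```python
def neighbor_joining(names, distances):
    """Return the tree as a list of (node, neighbour, branch length) edges."""
    d = {a: dict(row) for a, row in distances.items()}  # work on a copy
    nodes = list(names)
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # r[a]: sum of distances from a to all other current nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(
            ((a, b) for k, a in enumerate(nodes) for b in nodes[k + 1:]),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        # Branch lengths from i and j to the new internal node u.
        length_i = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        length_j = d[i][j] - length_i
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, length_i), (j, u, length_j)]
        # Distances from u to every remaining node.
        d[u] = {}
        for k in nodes:
            if k not in (i, j):
                d[u][k] = d[k][u] = (d[i][k] + d[j][k] - d[i][j]) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Distance matrix of the example; taxon labels A-D are assumed.
pairs = {("A", "B"): 6, ("A", "C"): 7, ("B", "C"): 3,
         ("A", "D"): 14, ("B", "D"): 10, ("C", "D"): 9}
names = ["A", "B", "C", "D"]
distances = {a: {} for a in names}
for (a, b), value in pairs.items():
    distances[a][b] = distances[b][a] = value

edges = neighbor_joining(names, distances)
for edge in edges:
    print(edge)
```

Unlike UPGMA, the terminal branches may differ in length, so the tree is additive but not necessarily ultrametric: path lengths on the tree reproduce the matrix (for example A to D is 5 + 1 + 8 = 14).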

Minimum evolution (ME): example

pairwise distances between hominoid sequences (above the diagonal: observed; below the diagonal: calculated)

             Human   Chimp   Gorilla   Orang-utan   Gibbon
Human          -       79       92        144         162
Chimp         79        -       95        154         169
Gorilla       92      102        -        150         169
Orang-utan   144      154      150          -         169
Gibbon       163      173      169        169          -

Minimum evolution (ME): example
[Figure: tree with terminal branch lengths Gibbon 94, Orang-utan 75, Gorilla 49, Human 34.5 and Chimpanzee 44.5, and internal branch lengths 26 and 8.5]
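The tree length L and the tree-derived (calculated) distances of the hominoid example can be checked with a short sketch. The assignment of the seven branch lengths to particular branches, and the internal node names n1 to n3, are assumptions inferred from the numbers: with this assignment the path lengths reproduce the calculated distances in the matrix.

```python
from collections import defaultdict

# Branches of the unrooted tree: 2n - 3 = 7 branches for n = 5 taxa.
# Internal node names n1-n3 are hypothetical.
branches = {
    ("Human", "n1"): 34.5,
    ("Chimpanzee", "n1"): 44.5,
    ("n1", "n2"): 8.5,
    ("Gorilla", "n2"): 49.0,
    ("n2", "n3"): 26.0,
    ("Orang-utan", "n3"): 75.0,
    ("Gibbon", "n3"): 94.0,
}

# Tree length L: the sum of all individual branch lengths e_i.
tree_length = sum(branches.values())

adjacency = defaultdict(dict)
for (a, b), length in branches.items():
    adjacency[a][b] = length
    adjacency[b][a] = length


def path_length(start, goal, previous=None):
    """Sum of branch lengths on the unique path between two nodes."""
    if start == goal:
        return 0.0
    for neighbour, length in adjacency[start].items():
        if neighbour == previous:
            continue
        rest = path_length(neighbour, goal, start)
        if rest is not None:
            return length + rest
    return None  # goal is not reachable through this subtree


print("L =", tree_length)
print("Human-Chimpanzee:", path_length("Human", "Chimpanzee"))
print("Chimpanzee-Gibbon:", path_length("Chimpanzee", "Gibbon"))
```

Under the minimum evolution criterion, L would be computed in this way for each candidate tree, and the tree with the smallest L would be kept.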

Maximum Parsimony (MP)
- Maximum parsimony principle: preference for the least complex explanation for an observation
- in phylogenetics: choosing the tree that requires the fewest evolutionary changes ("most parsimonious tree")...
- i.e., the tree that requires the fewest mutational steps...
- i.e., the tree with the shortest tree length

Maximum Parsimony (MP)
tree length: $L = \sum_{i=1}^{k} l_i$
- $k$: number of nucleotide sites
- $l_i$: tree length for an individual site
- The most parsimonious tree is the one that minimizes $L$

Maximum Parsimony (MP)
[Figure: alignment of four taxa at five sites (sites 1 to 5)]

Maximum Parsimony (MP)
[Figure: the same alignment, with site 1 mapped onto alternative trees; the reconstructions are labelled 1 change, 1 change, 2 changes and 2 changes]

Maximum Parsimony (MP)
[Figure: the alignment with each of sites 1 to 5 mapped onto a tree; the reconstructions are labelled 1 change, 1 change, 1 change, 1 change and 0 change]

Maximum Parsimony (MP)
Number of changes per site for each possible tree of the form ((_,_),(_,_)):

          Sites
Tree      1   2   3   4   5   Total
Tree 1    1   1   2   1   0     5
Tree 2    2   2   1   1   0     6
Tree 3    2   2   2   1   0     7
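Counting the changes per site can be sketched with Fitch's algorithm, a standard way to obtain the minimum number of changes for a site on a given tree (the slides do not name the counting method, so this is a swapped-in technique). The taxon names A to D and the five-site alignment are hypothetical: they were chosen so that the per-site counts match the table above, and they are not the sequences shown on the slides.

```python
def fitch_changes(tree, states):
    """Minimum number of changes for one site on a binary tree (Fitch).

    tree: nested 2-tuples of taxon names; states: dict taxon -> nucleotide.
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):  # leaf: the observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:  # children agree on at least one state: no change needed
            return common
        changes += 1  # children disagree: one change on this node's branches
        return left | right

    state_set(tree)
    return changes


# Hypothetical alignment, chosen to reproduce the per-site counts above.
alignment = {"A": "TTTGA", "B": "TTCGA", "C": "CGTGA", "D": "CGCTA"}

# The three possible groupings of four taxa.
trees = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}


def site_lengths(tree):
    """Changes required at each site; their sum is the tree length L."""
    n_sites = len(next(iter(alignment.values())))
    return [
        fitch_changes(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    ]


for name, tree in trees.items():
    per_site = site_lengths(tree)
    print(name, per_site, "total:", sum(per_site))
```

The tree with the smallest total, here ((A,B),(C,D)) with five changes, is the most parsimonious one for this hypothetical alignment.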