Phylogeny. Information. ARB-Workshop 14/ CEH Oxford. Molecular Markers. Phylogeny The Backbone of Biology. Why? Zuckerkandl and Pauling 1965
- Jordan Townsend
- 5 years ago
Transcription
Frank Oliver Glöckner

Information
Who are we: Dr. Frank Oliver Glöckner, Dr. Jörg Peplies
Max Planck Institute for Marine Microbiology, Microbial Genomics Group, Bremen, Germany
Contact:
Mailing list:
ARB-Workshop 14/ CEH Oxford
Where can you find additional information: ftp.mpibremen.de/molecol_p/arb > all files needed to install ARB can be found in the CEH_Oxford folder

Phylogeny: The Backbone of Biology
Why?
To track back the origin of organisms
To unravel evolutionary relationships
To sort and classify organisms
How?
Botany and zoology: morphology, fossils
Microbiology: molecular markers

Molecular Markers
Zuckerkandl and Pauling 1965: use macromolecules as molecular clocks (DNA/RNA, proteins)
Problem: species vs. gene phylogeny
Lateral gene transfer
Genome plasticity/patchwork
Orthologous/paralogous genes

Universal Tree
[Figure: universal tree] Doolittle, Science 1999, 284:2124-2128

Homology
Definition: two sequences are homologous when they evolved from a common ancestor sequence
Homology cannot be quantified! Sequences are homologous or not!
Orthologous genes: direct common ancestor
Paralogous genes: originate from a gene duplication
Orthologs/Paralogs
[Figure: gene duplication and speciation events in species A and species B]

Phylogenetic Markers
16S rRNA, 23S rRNA
Elongation factors EF-Tu, EF-G
ATP synthase
Reg
Hsp60
RNA polymerase
Gyrase
Housekeeping genes: transcription, translation

rRNA as Phylogenetic Marker
Advantages:
Functional constancy
Ubiquitous distribution
Large size (information content)
Conserved and highly variable structural elements
No lateral gene transfer
Drawbacks:
No continuous sequence change
Multiple genes/operons
Different species with identical 16S rRNAs
One base change needs nearly one million years

Steps in Phylogenetic Analysis
Sequence determination
Alignment
Data analysis
Phylogenetic reconstruction

Sequence Determination
Automatic sequencers:
ABI Prism 377 (gel)
ABI Prism 3100 (16 capillary)
ABI Prism 3700 (96 capillary)
MegaBACE 500 (48 capillary)
MegaBACE 1000 (96 capillary)

The European Database on Ribosomal RNA
Maintenance: Department of Biochemistry, University Antwerpen
Services:
SSU, LSU sequences, annotations
Alignments, secondary structures, variability maps
WWW interface for sequence retrieval
Software for alignment and tree reconstruction
Content (aligned sequences): Release September 00; 0,85 SSU sequences; ,400 LSU sequences
RDP-II: Ribosomal Database Project
Maintenance: Center for Microbial Ecology, Michigan State University
Services:
SSU, LSU sequences, annotations
Alignments
Phylogenetic trees
Analysis services via WWW server
Content (aligned sequences): RDP Preview Release from 05/05/2004; 97,8 SSU sequences; 7 LSU sequences

ARB: Software Environment for Sequence Data
Maintenance: Department for Microbiology, Technical University Munich
Services:
SSU, LSU sequences, annotations
Alignments, phylogenetic trees
Probe design, probe match
Software suite ARB
Content (aligned sequences): Prerelease July 04; 59,609 SSU sequences; 698 LSU sequences

Alignment
Problem: variable regions
Align the sequences in such a way that homologous bases stand one below the other in a column

The 16S Secondary Structure (ARB Edit)
[Figure: the 70S ribosome with its 30S subunit (21 proteins, 16S rRNA) and 50S subunit (34 proteins, 5S rRNA, 23S rRNA); Escherichia coli 16S rRNA primary and secondary structure]
Secondary Structures
[Figure: secondary structure information compared for Escherichia coli, Mycoplasma hyopneumoniae and Streptococcus oralis; SSU secondary structure]

Data Analysis: Information Content
Marker              Size (E. coli)   Information (bits)   Similarity
16S rRNA            1542 nt          3084                 >67%
23S rRNA            2904 nt          5808                 >67%
EF-Tu               394 aa           706                  >60%
ATPase β subunit    460 aa           99                   >6%
Ludwig and Klenk, Bergey's

Information Content
[Table: number and percentage of conserved and variable characters, information (variable) and information (real) for 16S rRNA, 23S rRNA, EF-Tu and ATPase β subunit]

Models of Evolution
Models of substitution rates between bases:
Transition: A>G, G>A, C>T, T>C
Transversion: A>C, A>T, G>C, G>T and reverse
Amino acids: PAM and BLOSUM matrices
Base frequencies
Models of among-site substitution rate heterogeneity:
Weighting particular sites according to relative mutation frequencies (position variability)
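The distinction between transitions (purine to purine, pyrimidine to pyrimidine) and transversions (purine to pyrimidine and reverse) can be illustrated with a short sketch. The function names `classify_substitution` and `count_substitutions` are hypothetical and only serve this example; they are not part of ARB.

```python
# Minimal sketch: classify base substitutions as transitions or transversions.
# All names here are illustrative assumptions, not taken from the slides or ARB.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old: str, new: str) -> str:
    """Return 'identical', 'transition' or 'transversion' for two DNA bases."""
    old, new = old.upper(), new.upper()
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if old == new:
        return "identical"
    # Both purines or both pyrimidines: transition.
    if {old, new} <= PURINES or {old, new} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq1: str, seq2: str) -> dict:
    """Count substitution classes column by column in two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"identical": 0, "transition": 0, "transversion": 0}
    for x, y in zip(seq1, seq2):
        counts[classify_substitution(x, y)] += 1
    return counts
```

For two aligned sequences, `count_substitutions("ACGT", "GCTT")` reports one transition (A>G), one transversion (G>T) and two identical positions. Gaps and ambiguity codes are rejected in this sketch.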
Models of Evolution
Jukes-Cantor model:
All substitution types and base frequencies are presumed equal
Time reversible
Kimura 2-parameter model:
Transitions are more likely than transversions
Equal base frequencies
Time reversible

Substitution Models: Treeing Methods
Maximum Parsimony: fixed cost matrices
Distance Matrix and Maximum Likelihood: general model of sequence evolution
Not addressed:
Lineage-specific substitutions
Different rates of evolution between lineages

General model of sequence evolution:
$$Q = \begin{pmatrix} -(a_1+a_2+a_3) & a_1 & a_2 & a_3 \\ a_4 & -(a_4+a_5+a_6) & a_5 & a_6 \\ a_7 & a_8 & -(a_7+a_8+a_9) & a_9 \\ a_{10} & a_{11} & a_{12} & -(a_{10}+a_{11}+a_{12}) \end{pmatrix}$$
a = relative rate between the different substitutions x frequency of target base
Swofford, Book (Hillis), 1996

Models of Evolution
[Figures: general matrix, General Time Reversible (GTR), Jukes-Cantor, Kimura's 2-parameter model (K2P)]
Swofford, Book (Hillis), 1996

Treeing Methods: Classification
"Inferring a phylogeny is really an estimation procedure; we are making a best estimate of an evolutionary history based on incomplete information" (Swofford, 1990)
Distance-based: compute pairwise distances and use them to derive the tree
Character-based: work directly on each character of the data; derive trees that optimize the distribution of the actual data pattern for each character (Maximum Parsimony, Maximum Likelihood)
Algorithm-based: generate a tree according to a series of steps (e.g. neighbor joining)
Criterion-based: evaluation of alternative trees according to some optimization function
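To make the difference between the Jukes-Cantor and Kimura 2-parameter models concrete, the sketch below builds both instantaneous rate matrices in plain Python. The function names and the parameter names `rate`, `alpha` (transition rate) and `beta` (transversion rate) are assumptions chosen for this example; base order A, C, G, T is also an assumption.

```python
# Minimal sketch of two substitution rate matrices (rows sum to zero).
# Names and base order are illustrative assumptions.
BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def jukes_cantor_matrix(rate: float = 1.0) -> list:
    """Jukes-Cantor: every substitution type has the same rate."""
    return [[-3 * rate if i == j else rate for j in range(4)] for i in range(4)]


def kimura_2p_matrix(alpha: float, beta: float) -> list:
    """Kimura 2-parameter: transitions at rate alpha, transversions at rate beta."""
    matrix = []
    for x in BASES:
        row = []
        for y in BASES:
            if x == y:
                # One possible transition and two possible transversions per base.
                row.append(-(alpha + 2 * beta))
            elif (x, y) in TRANSITIONS:
                row.append(alpha)
            else:
                row.append(beta)
        matrix.append(row)
    return matrix
```

With `alpha` larger than `beta`, the K2P matrix expresses that transitions are more likely than transversions; setting `alpha == beta` gives the Jukes-Cantor matrix again.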
The Most Common Methods for Tree Reconstruction
Distance Matrix:
Calculation of distance matrices by binary comparison of the aligned sequences
UPGMA or Neighbor Joining
Maximum Parsimony:
Preservation is more likely than change
Search for topologies that minimize the total tree length assuming a minimum number of base changes
Maximum Likelihood:
Searches for the evolutionary model, including the tree itself, that has the highest likelihood of producing the observed data
Models: transition/transversion; base frequencies; positional variability

Definitions
[Figure: radial tree and dendrogram labelled with peripheral branch, internal branch, central branch, terminal nodes/tips, links/edges, internal nodes]
Unrooted tree: the location of the common ancestor is not specified

Distance Matrix: Ultrametric Data
UPGMA: Unweighted Pair Group Method with Arithmetic Mean

Distance Matrix: Non-Ultrametric Data

Distance Matrix: Additive Trees
Example additive trees: Fitch-Margoliash algorithm
Calculate the matrix
Find the most closely related pair of sequences and link it by an internal node
Link the next related sequence with an internal node
Calculate branch lengths
[Figure: distance matrix for sequences A to E and the three-branch tree with branch lengths a, b, c]
A to B = a + b = 22 (1)
A to C = a + c = 39 (2)
B to C = b + c = 41 (3)
Subtract (3) from (2): a - b = 39 - 41 = -2 (4)
Add (1) and (4): 2a = 20, a = 10
From (1) and (2): b = 12, c = 29
Mount, Book, Bioinformatics 2001, p. 57
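The branch-length step of the Fitch-Margoliash example can be written as a closed-form solution of the three equations a + b, a + c and b + c. The sketch below assumes pairwise distances of 22 (A to B), 39 (A to C) and 41 (B to C); the function name is hypothetical.

```python
# Minimal sketch: solve a+b, a+c, b+c for the three branch lengths of a
# three-taxon tree. The function name is an illustrative assumption.
def three_point_branch_lengths(d_ab: float, d_ac: float, d_bc: float) -> tuple:
    """Return branch lengths (a, b, c) leading to taxa A, B and C."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


a, b, c = three_point_branch_lengths(22, 39, 41)
print(a, b, c)
```

A negative result for any branch would indicate that the three distances do not fit an additive tree.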
Principle of Neighbor Joining (Saitou and Nei, 1987)
The fully resolved tree is decomposed from a fully unresolved star tree by successively inserting branches between a pair of closest neighbors and the remaining terminals in the tree
[Figure: star decomposition of a star tree with terminals A to H]

Distance Matrix: Additive Trees
Finding a tree that fits the matrix
Find the optimal values for the branching pattern and the branch lengths
NJ will find the correct tree if the distances are additive
Problem: non-additive distances caused by superimposed changes
[Figure: observed distance vs. real distance]

Dealing with non-additive distances
Instead of using raw dissimilarity, correct distances based on expected numbers of hidden changes
For some models (JC, K2P, F84) simple distance equations exist
For others one must use ML
Outcome: additivity is not restored! > An optimality criterion is needed
Most widely used = least-squares criterion (e.g., Fitch-Margoliash); can lead to negative branch lengths
Minimum Evolution (PAUP)

Distance Matrix: Correct for Multiple Changes
Jukes and Cantor, 1969

Pros and Cons (distance methods)
Very fast
Only one tree is derived
Topology and branch lengths are calculated
Accounts for false identities
Works with different models of evolution
Discards the primary character data
Different sequences can yield the same matrix
"A distance method would reconstruct the true tree if all genetic divergence events were accurately recorded in the sequence" (Swofford, 1996)

Maximum Parsimony
MP is an optimality criterion that appeals to the principle: the simplest explanation of the data is the best
Model of evolution: preservation is more likely than change
Character-based method
Evaluates trees
Selects trees that minimize the total tree length
Needs a set of outgroup taxa
Calculations are done from the terminal nodes towards the (arbitrary) root
Implicit model of evolution; no additional model needed
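The correction for multiple (hidden) changes under the Jukes-Cantor model has a simple closed form, d = -3/4 ln(1 - 4p/3), where p is the observed fraction of differing sites. The sketch below assumes gap-free aligned DNA sequences; the function names are hypothetical.

```python
# Minimal sketch: observed dissimilarity and its Jukes-Cantor correction.
# Function names are illustrative assumptions; gaps are not handled.
import math


def p_distance(seq1: str, seq2: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)


def jukes_cantor_distance(p: float) -> float:
    """Expected substitutions per site for an observed dissimilarity p."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the sequences are saturated and the formula is undefined.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


p = p_distance("ACGTACGTAC", "ACGTACGTTT")
print(p, jukes_cantor_distance(p))
```

The corrected distance is always at least as large as the observed one, and the gap widens as the sequences diverge, which is the effect of superimposed changes.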
Maximum Parsimony: Evaluation of Trees
The alignment is checked for informative positions
To be informative, a site must have the same sequence characters in at least two taxa (e.g. sites 1, 2, 3, 5)
And they must favor one topology over another (only site 5)
Only the informative sites are analyzed
[Figure: the three possible topologies for sequences S1 to S4 with the number of mutations required per site]

Pros and Cons (Maximum Parsimony)
Works directly on the data
Works fine on data with strong similarity
Relatively fast
Does not need a model of evolution
Calculates only topologies
Performs weakly on distantly related data
Prone to false identities (multiple changes): long branch attraction
Can produce many trees with the same parsimony score

Maximum Likelihood
ML evaluates a hypothesis about evolutionary history in terms of the probability that a proposed model of the evolutionary process and the hypothesized history would give rise to the observed data
Character-based method
Concrete model of evolution needed
Assumes that nucleotide sites evolve independently
Likelihood for each site is calculated separately and combined to a total value for a tree
Looks for the tree with the highest likelihood; L = maximal

Maximum Likelihood
The likelihood of the full tree is the product of the likelihoods at each site:
$$L = L_{(1)} \times L_{(2)} \times \dots \times L_{(N)} = \prod_{j=1}^{N} L_{(j)}$$
Because the probability of any single observation is an extremely small number, they are normally handled as logarithms:
$$\ln L = \ln L_{(1)} + \ln L_{(2)} + \dots + \ln L_{(N)} = \sum_{j=1}^{N} \ln L_{(j)}$$
For every internal node all four nucleotides are allowed > 4x4 = 16 probabilities
Each probability is the product of the probability of the base in (6) and the transition/transversion probabilities
e.g. prob. of the base = 0.25 or the average frequency of that base in the sequence (> depends on model); transversion = 10^-6 and transition = 2x10^-6
Likelihood = 0.25 x 2x10^-6 x 10^-6 = 5x10^-13
Swofford, Book (Hillis), 1996
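Why site likelihoods are summed as logarithms rather than multiplied directly can be shown numerically. The sketch below assumes 100 independent sites that each have the per-site likelihood 5x10^-13 from the worked example; the function names are hypothetical.

```python
# Minimal sketch: product of site likelihoods vs. sum of log-likelihoods.
# Function names and the choice of 100 identical sites are assumptions.
import math


def tree_likelihood(site_likelihoods: list) -> float:
    """Product of the per-site likelihoods (underflows for many sites)."""
    return math.prod(site_likelihoods)


def tree_log_likelihood(site_likelihoods: list) -> float:
    """Sum of the per-site log-likelihoods (numerically stable)."""
    return sum(math.log(value) for value in site_likelihoods)


sites = [5e-13] * 100
print(tree_likelihood(sites))       # underflows to 0.0 in double precision
print(tree_log_likelihood(sites))   # a finite negative number
```

Because the logarithm is monotonic, the tree with the highest ln L is also the tree with the highest L, so nothing is lost by comparing trees on the log scale.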
Pros and Cons (Maximum Likelihood)
Works directly on the data
Performs well also on distantly related data
Includes models of evolution
The whole tree is under evaluation: topologies and branch lengths are optimized
Currently regarded as the best method
Computationally intense: number of sequences is limited

MP vs. ML
[Figure] Swofford, Book (Hillis), 1996

Searching for optimal trees
Exact algorithms:
Exhaustive search
Branch-and-bound methods
Heuristic approaches:
Stepwise addition
Star decomposition
Branch swapping

How many trees do we have to evaluate?
Places to add another taxon:
Two taxa = 1
Three taxa = 3
Four taxa = 5
Five taxa = 7
[Figure: the 15 possible unrooted trees for 5 taxa]

Exhaustive Topologies
Number of unrooted, bifurcating trees:
$$B(T) = \prod_{i=3}^{T} (2i - 5)$$
No. of sequences   No. of trees
3                  1
4                  3
5                  15
6                  105
7                  945
8                  10,395
9                  135,135
10                 2,027,025
50                 2.8x10^74
The root is just another taxon, so for rooted trees:
No. of sequences   No. of trees
2                  1
3                  3
4                  15
Swofford, Book (Hillis), 1996
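The growth in the number of topologies follows directly from the product B(T) = (2i - 5) for i = 3..T and can be checked with exact integer arithmetic. The function names below are hypothetical.

```python
# Minimal sketch: count unrooted and rooted bifurcating trees.
# Function names are illustrative assumptions.
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted bifurcating trees: product of (2i - 5), i = 3..n."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    for i in range(3, n_taxa + 1):
        count *= 2 * i - 5
    return count


def count_rooted_trees(n_taxa: int) -> int:
    """The root acts as one extra taxon, so use the unrooted count for n + 1."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return count_unrooted_trees(n_taxa + 1)


for n in (3, 4, 5, 10, 50):
    print(n, count_unrooted_trees(n))
```

Each factor (2i - 5) is the number of branches available for attaching taxon i, which is why the count explodes so quickly and exhaustive searches are limited to small data sets.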
Exact algorithms
Exhaustive (small numbers of taxa only): all trees are evaluated
Branch-and-bound (somewhat larger numbers of taxa):
Construct a random tree with all sequences and evaluate its value L under the chosen optimality criterion, according to the reconstruction method and model used
This is the initial upper bound of L
Start to reconstruct trees from 3 to X taxa by stepwise addition of taxa
Evaluate each tree: if the score exceeds L there is no need to go further along this path; if the score < L, proceed
If the score at the end of the path is less than L, take this as the new upper bound
[Figure: search tree for branch-and-bound] Swofford, Book (Hillis), 1996

Heuristic Approaches: Global vs. Local Optimum
Heuristic tree searches generally operate by hill-climbing methods
Start with an initial tree
Optimize (rearrange) it under the chosen optimality criterion
If we find no way for further improvement, stop
Problem: there is no way of knowing if we reached the global or merely a local optimum
[Figure: global vs. local optimum]

Heuristics: Stepwise Addition
Start with three sequences
Add next taxon, evaluate tree, do rearrangements
Save the one with the best score, add next taxon
Addition order:
In the order of the data in the alignment
Use a distance algorithm to decide order, e.g. by closest taxon addition
Add the taxon that makes the optimal, e.g. shortest, tree
Random taxon addition order
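The local-optimum problem of hill climbing can be reproduced on a toy landscape. The sketch below is not a tree search: it assumes a one-dimensional list of scores in which higher is better and each position has at most two neighbors, standing in for trees and their rearrangements. All names and values are hypothetical.

```python
# Minimal sketch of hill climbing on a toy score landscape (higher is better).
# The landscape, the neighbor function and all names are illustrative assumptions.
def hill_climb(score, start, neighbors):
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    current = start
    while True:
        candidates = neighbors(current)
        best = max(candidates, key=score) if candidates else current
        if score(best) <= score(current):
            return current  # no further improvement: stop here
        current = best


SCORES = [1, 3, 5, 4, 2, 6, 9, 7]  # local peak at index 2, global peak at index 6


def score(position: int) -> int:
    return SCORES[position]


def neighbors(position: int) -> list:
    return [p for p in (position - 1, position + 1) if 0 <= p < len(SCORES)]


print(hill_climb(score, 0, neighbors))  # stops on the local peak
print(hill_climb(score, 4, neighbors))  # reaches the global peak
```

The outcome depends only on the starting point, which is why heuristic tree searches are usually repeated from several starting trees or addition orders.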
Heuristics: Branch Swapping
Nearest Neighbor Interchange (NNI)
Subtree pruning and regrafting (SPR)
Tree bisection and reconnection (TBR)
Hoping to find a better tree by disturbing (rearranging) the tree to overcome local optima
Problem: if the tree is on a plateau and the global optimum is several steps away, we might still not reach it
[Figures: NNI, SPR and TBR rearrangements] Felsenstein, Book, 2004

Confidence Tests: Bootstrapping
Resampling tree evaluation technique
New data sets are created from the original data set by sampling columns of characters at random with replacement
Each site can be sampled again with the same probability as any of the other sites
Problem: some positions can be overrepresented, some sites are missing
At least 100, better 1,000 trees should be calculated
Remember: high bootstrap values can make a wrong phylogeny look good!

[Example alignment]
           * * * * * * * * * *** ********* ** * *** *** * **** ***** **** ****
consensus  tngccatctttcacgnaacanncnctngcngaca
HI         attgcagtgtattggggacaaaatggaaatgaagggtctttgcaagatgc
PSHI       atagctgtttactggggccaaaacggtggagaaggatccttagcagacac
NIDL       atagtaatatattggggccaaaatgggaatgaaggtagcttagctgacac
S6608      attgtcatatactggggccaaaatggtgatgaaggaagtcttgctgacac
USSEQ_     atcgccatctattggggccaaaacggcaacgaaggctctcttgcatccac
USSEQ_     atcgccatctattggggtcaaaacggcaacgagggctctcttgcatccac
USSEQ_     atcggcatctattggggccaaaacggcaacgaaggctctcttgcatccac
VIRE       atttccgtctactggggtcaaaacggtaacgagggctccctggccgacgc
VURNH      auuuccgucuacuggggucaaaacggcaacgagggcucucuggccgacgc
HHI        atagccatctattggggccaaaacggaaacgaaggtaacctctctgccac
VURNHB     auagccaucuacuggggccaaaacggcaacgagggaacgcuuuccgaagc
NBSIL      attgtagtctattggggccaagatgtaggagaaggtaaattgattgacac
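The resampling step of bootstrapping, drawing alignment columns at random with replacement, can be sketched as follows. The alignment is assumed to be a dictionary of equal-length sequences keyed by name; the function name and the toy data are hypothetical. Tree reconstruction on each replicate is not shown.

```python
# Minimal sketch: create bootstrap replicates by sampling alignment columns
# with replacement. Names and the toy alignment are illustrative assumptions.
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one pseudo-replicate with as many columns as the original."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_columns = lengths.pop()
    # Every column has the same chance of being drawn at every draw.
    drawn = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in drawn) for name, seq in alignment.items()}


toy = {"seq1": "ATTGCAGT", "seq2": "ATAGCTGT", "seq3": "ATCGCCAT"}
rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(replicates[0])
```

In a full analysis one tree would be calculated per replicate, and the bootstrap value of a branch is the percentage of replicate trees that contain it.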
Why do trees differ?
Information content
Sequencing errors
Alignment: homology of characters
Non-additive data (false identities)
Different and simplified models of evolution:
Independence of data
Lineage- and/or position-specific rate of evolution
Data selection: only subsets of organisms and positions
Calculation heuristics:
Small amount of evaluated trees
Strong dependence on the order of input data
Local or global optimum?

Consensus trees
[Figure]

Practical implications: Filters
Filters: remove or weight down individual alignment columns while treeing
Keep balance between data loss and gain of accuracy
E.g. 50% conservation filter: columns in the alignment are only considered for tree reconstruction when at least 50% of the sequences show the same residue
Position variability: the position variability for every column is calculated and shown as numbers 1-9 and characters A-Z
1 means highly variable
Z means extremely conserved (never seen)
[Screenshot: ARB Filter]

Practical implications: Outgroup
Choose as many sequences for the outgroup as possible
They should not be too distantly related to the group of interest
[Picture: ARB Phylo]

Practical implications: Data
Always use the largest dataset available; if necessary remove sequences after the calculation
Compare different algorithms
Reject problematic data
Never reconstruct trees or filters on partial sequence data
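A 50% conservation filter can be sketched as a boolean mask over alignment columns. This is an illustration of the idea only and not ARB's filter implementation; the function names are hypothetical, and gap characters are counted like any other residue, which is a simplifying assumption.

```python
# Minimal sketch of a conservation filter on an alignment.
# Not ARB's implementation; names and gap handling are assumptions.
from collections import Counter


def conservation_filter(sequences: list, threshold: float = 0.5) -> list:
    """Mark columns where the most frequent residue reaches the threshold."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    mask = []
    for column in zip(*sequences):
        most_common_count = Counter(column).most_common(1)[0][1]
        mask.append(most_common_count / len(sequences) >= threshold)
    return mask


def apply_filter(sequences: list, mask: list) -> list:
    """Keep only the columns marked True in the mask."""
    return ["".join(c for c, keep in zip(seq, mask) if keep) for seq in sequences]


alignment = ["ACGT", "ACCA", "AGTC", "ATAG"]
mask = conservation_filter(alignment, threshold=0.5)
print(mask)
print(apply_filter(alignment, mask))
```

Raising the threshold removes more variable columns, which trades information for robustness, the balance between data loss and gain of accuracy mentioned above.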
ARB Internal Architecture: The Concept of ARB
[Figure: the database and database management are linked to sequence alignment, phylogenetic reconstructions and the probe functions (Probe_Design, Probe_Match); the PT-Server answers requests with possible probes, matching sequences and the next relative, and receives updates]

PT-Server
Not delivered with ARB
Different format of your database for faster performance of sequence search functions within ARB
It is only used to search the next relative for the automatic aligner and for Probe_Design/Probe_Match
Creating/updating takes a long time and a lot of memory
Once it has been created, searching is very fast

Do not overdo it
Chapter 4 Finding the best tree by heuristic search If we cannot find the best trees by examining all possible trees, we could imagine searching in the space of possible trees. In this chapter we will
More informationInDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9
Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic
More informationPrinciples of Phylogeny Reconstruction How do we reconstruct the tree of life? Basic Terminology. Looking at Trees. Basic Terminology.
Principles of Phylogeny Reconstruction How do we reconstruct the tree of life? Phylogeny: asic erminology Outline: erminology Phylogenetic tree: Methods Problems parsimony maximum likelihood bootstrapping
More informationPhylogeny. November 7, 2017
Phylogeny November 7, 2017 Phylogenetics Phylon = tribe/race, genetikos = relative to birth Phylogenetics: study of evolutionary relationships among organisms, sequences, or anything in between Related
More informationCS5263 Bioinformatics. Guest Lecture Part II Phylogenetics
CS5263 Bioinformatics Guest Lecture Part II Phylogenetics Up to now we have focused on finding similarities, now we start focusing on differences (dissimilarities leading to distance measures). Identifying
More informationConcepts and Methods in Molecular Divergence Time Estimation
Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks
More informationIntroduction to Bioinformatics
Introduction to Bioinformatics Lecture : p he biological problem p lobal alignment p Local alignment p Multiple alignment 6 Background: comparative genomics p Basic question in biology: what properties
More informationPhylogeny: building the tree of life
Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan
More information8/23/2014. Phylogeny and the Tree of Life
Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major
More informationPhylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?
Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species
More informationInferring Molecular Phylogeny
r. Walter Salzburger The tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 2 1. Molecular Markers Inferring Molecular Phylogeny 3 Immunological comparisons! Nuttall
More informationA phylogenetic view on RNA structure evolution
3 2 9 4 7 3 24 23 22 8 phylogenetic view on RN structure evolution 9 26 6 52 7 5 6 37 57 45 5 84 63 86 77 65 3 74 7 79 8 33 9 97 96 89 47 87 62 32 34 42 73 43 44 4 76 58 75 78 93 39 54 82 99 28 95 52 46
More informationChapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships
Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic
More informationConsensus Methods. * You are only responsible for the first two
Consensus Trees * consensus trees reconcile clades from different trees * consensus is a conservative estimate of phylogeny that emphasizes points of agreement * philosophy: agreement among data sets is
More informationBiology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29):
Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29): Statistical estimation of models of sequence evolution Phylogenetic inference using maximum likelihood:
More informationMidterm Exam #1. MB 451 Microbial Diversity. Honor pledge: I have neither given nor received unauthorized aid on this test.
Midterm xam #1 M 451 Microbial iversity Honor pledge: I have neither given nor received unauthorized aid on this test. Signed : ate : Feb 5, 2007 Name : KY 1. What are the three primary evolutionary branches
More informationNJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees
NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana
More informationUsing phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)
Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures
More informationPhylogenetic inference: from sequences to trees
W ESTFÄLISCHE W ESTFÄLISCHE W ILHELMS -U NIVERSITÄT NIVERSITÄT WILHELMS-U ÜNSTER MM ÜNSTER VOLUTIONARY FUNCTIONAL UNCTIONAL GENOMICS ENOMICS EVOLUTIONARY Bioinformatics 1 Phylogenetic inference: from sequences
More informationAdditive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.
Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then
More informationMultiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:
Multiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:50 5001 5 Multiple Sequence Alignment The first part of this exposition is based on the following sources, which are recommended reading:
More informationBootstrapping and Tree reliability. Biol4230 Tues, March 13, 2018 Bill Pearson Pinn 6-057
Bootstrapping and Tree reliability Biol4230 Tues, March 13, 2018 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 Rooting trees (outgroups) Bootstrapping given a set of sequences sample positions randomly,
More informationC.DARWIN ( )
C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships
More informationX X (2) X Pr(X = x θ) (3)
Notes for 848 lecture 6: A ML basis for compatibility and parsimony Notation θ Θ (1) Θ is the space of all possible trees (and model parameters) θ is a point in the parameter space = a particular tree
More informationPhylogenetics: Parsimony
1 Phylogenetics: Parsimony COMP 571 Luay Nakhleh, Rice University he Problem 2 Input: Multiple alignment of a set S of sequences Output: ree leaf-labeled with S Assumptions Characters are mutually independent
More informationA Fitness Distance Correlation Measure for Evolutionary Trees
A Fitness Distance Correlation Measure for Evolutionary Trees Hyun Jung Park 1, and Tiffani L. Williams 2 1 Department of Computer Science, Rice University hp6@cs.rice.edu 2 Department of Computer Science
More informationUoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)
- Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the
More informationSequence Alignment: A General Overview. COMP Fall 2010 Luay Nakhleh, Rice University
Sequence Alignment: A General Overview COMP 571 - Fall 2010 Luay Nakhleh, Rice University Life through Evolution All living organisms are related to each other through evolution This means: any pair of
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of omputer Science San José State University San José, alifornia, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Pairwise Sequence Alignment Homology
More informationOutline. Sequence-comparison methods. Buzzzzzzzz. Why compare sequences? Gerard Kleywegt Uppsala University
MB330 - January, 2006 Sequence-comparison methods erard Kleywegt Uppsala University Outline! Why compare sequences?! Dotplots! airwise sequence alignments &! Multiple sequence alignments! rofile methods!
More informationMultiple Alignment. Slides revised and adapted to Bioinformática IST Ana Teresa Freitas
n Introduction to Bioinformatics lgorithms Multiple lignment Slides revised and adapted to Bioinformática IS 2005 na eresa Freitas n Introduction to Bioinformatics lgorithms Outline Dynamic Programming
More informationMolecular Evolution, course # Final Exam, May 3, 2006
Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least
More informationMolecular Evolution and Phylogenetic Tree Reconstruction
1 4 Molecular Evolution and Phylogenetic Tree Reconstruction 3 2 5 1 4 2 3 5 Orthology, Paralogy, Inparalogs, Outparalogs Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length
More informationCladistics and Bioinformatics Questions 2013
AP Biology Name Cladistics and Bioinformatics Questions 2013 1. The following table shows the percentage similarity in sequences of nucleotides from a homologous gene derived from five different species
More informationCopyright notice. Molecular Phylogeny and Evolution. Goals of the lecture. Introduction. Introduction. December 15, 2008
opyright notice Molecular Phylogeny and volution ecember 5, 008 ioinformatics J. Pevsner pevsner@kennedykrieger.org Many of the images in this powerpoint presentation are from ioinformatics and Functional
More informationInferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT
Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions
More informationMolecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço
Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço jcarrico@fm.ul.pt Charles Darwin (1809-1882) Charles Darwin s tree of life in Notebook B, 1837-1838 Ernst Haeckel (1934-1919)
More informationPhylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches
Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx Phylogenies Scores for Exhaustive Maximum Likelihood and s Searches Hyrum D. Carroll, Perry G. Ridge, Mark J. Clement, Quinn O. Snell
More informationPhylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University
Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X
More informationHomology Modeling. Roberto Lins EPFL - summer semester 2005
Homology Modeling Roberto Lins EPFL - summer semester 2005 Disclaimer: course material is mainly taken from: P.E. Bourne & H Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton,
More informationPhylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science
Phylogeny and Evolution Gina Cannarozzi ETH Zurich Institute of Computational Science History Aristotle (384-322 BC) classified animals. He found that dolphins do not belong to the fish but to the mammals.
More informationWeek 8: Testing trees, Bootstraps, jackknifes, gene frequencies
Week 8: Testing trees, ootstraps, jackknifes, gene frequencies Genome 570 ebruary, 2016 Week 8: Testing trees, ootstraps, jackknifes, gene frequencies p.1/69 density e log (density) Normal distribution:
More informationEstimating Evolutionary Trees. Phylogenetic Methods
Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent
More informationMOLECULAR SYSTEMATICS: A SYNTHESIS OF THE COMMON METHODS AND THE STATE OF KNOWLEDGE
CELLULAR & MOLECULAR BIOLOGY LETTERS http://www.cmbl.org.pl Received: 16 August 2009 Volume 15 (2010) pp 311-341 Final form accepted: 01 March 2010 DOI: 10.2478/s11658-010-0010-8 Published online: 19 March
More informationMolecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016
Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More information