Phylogeny. Information. ARB-Workshop 14/ CEH Oxford. Molecular Markers. Phylogeny The Backbone of Biology. Why? Zuckerkandl and Pauling 1965


Information

Who are we:
- Dr. Frank Oliver Glöckner
- Dr. Jörg Peplies
- Max Planck Institute for Marine Microbiology, Microbial Genomics Group, Bremen, Germany

Contact:
Mailing list:
ARB-Workshop 14/ CEH Oxford

Where can you find additional information:
- ftp.mpibremen.de/molecol_p/arb
- All files needed to install ARB can be found in the CEH_Oxford folder

Phylogeny: The Backbone of Biology

Why?
- To track back the origin of organisms
- To unravel evolutionary relationships
- To sort and classify organisms

How?
- Botany and Zoology: morphology, fossils
- Microbiology: molecular markers

Molecular Markers

Zuckerkandl and Pauling 1965: use macromolecules as molecular clocks
- DNA/RNA
- Proteins

Problem: species vs. gene phylogeny
- Lateral gene transfer
- Genome plasticity/patchwork
- Orthologous/paralogous genes

Universal Tree

[Figure: universal tree. Doolittle, Science 1999, 284:2124-8]

Homology

Definition: Two sequences are homologous when they evolved from a common ancestor sequence.
- Homology cannot be quantified! Sequences are homologous or not!
- Orthologous genes: direct common ancestor
- Paralogous genes: originate from a gene duplication

Orthologs/Paralogs

[Figure: a gene duplication followed by speciation events, shown for several species]

Phylogenetic Markers

- 16S rRNA, 23S rRNA
- Elongation factors EF-Tu, EF-G
- ATP synthase
- Reg
- Hsp60
- RNA polymerase
- Gyrase
- Housekeeping genes: transcription, translation

rRNA as Phylogenetic Marker

Advantages:
- Functional constancy
- Ubiquitous distribution
- Large size (information content)
- Conserved and highly variable structural elements
- No lateral gene transfer

Drawbacks:
- No continuous sequence change
- Multiple genes/operons
- Different species with identical 16S rRNAs
- One base change needs nearly one million years

Steps in Phylogenetic Analysis

- Sequence determination
- Alignment
- Data analysis
- Phylogenetic reconstruction

Sequence Determination

Automatic sequencers:
- ABI Prism 377 (gel)
- ABI Prism 3100 (16 capillary)
- ABI Prism 3700 (96 capillary)
- Megabase 500 (48 capillary)
- Megabase 1000 (96 capillary)

The European Database on Ribosomal RNA

Maintenance: Department of Biochemistry, University of Antwerp

Services:
- SSU, LSU sequences, annotations
- Alignments, secondary structures, variability maps
- WWW interface for sequence retrieval
- Software for alignment and tree reconstruction

Content (aligned sequences): Release September 00 0,85 SSU sequences,400 LSU sequences

RDP-II: Ribosomal Database Project

Maintenance: Center for Microbial Ecology, Michigan State University

Services:
- SSU, LSU sequences, annotations
- Alignments
- Phylogenetic trees
- Analysis services via WWW server

Content (aligned sequences): RDP Preview Release from 05/05/2004, 97,8 SSU sequences, 7 LSU sequences

ARB: Software Environment for Sequence Data

Maintenance: Department for Microbiology, Technical University Munich

Services:
- SSU, LSU sequences, annotations
- Alignments, phylogenetic trees
- Probe design, probe match
- Software suite ARB

Content (aligned sequences): Prerelease July 04, 59,609 SSU sequences, 698 LSU sequences

Alignment

Problem: variable regions

Align the sequences in such a way that homologous bases stand one below the other in a column.

The 16S Secondary Structure / ARB_Edit

[Figure: the 70S ribosome consists of the 30S subunit (21 proteins, 16S rRNA) and the 50S subunit (34 proteins, 5S rRNA, 23S rRNA)]

[Figure: Escherichia coli 16S rRNA primary and secondary structure]

Secondary Structures

[Figure: secondary structure information, with examples from Escherichia coli, Mycoplasma hyopneumoniae and Streptococcus oralis]

[Figure: SSU secondary structure]

Data Analysis: Information Content

Marker             Size (E. coli)   Information (bits)   Similarity
16S rRNA           1542 nt          3084                 >67%
23S rRNA           2904 nt          5808                 >67%
EF-Tu              394 aa           1706                 >60%
ATPase β subunit   460 aa           1992                 >6%

Information Content

[Table: number and percentage of conserved and variable characters, with Information (variable) and Information (real), for 16S rRNA, 23S rRNA, EF-Tu and ATPase β subunit; the values did not survive transcription. Source: Ludwig and Klenk, Bergeys]

Models of Evolution

Models of substitution rates between bases:
- Transition: A>G, G>A, C>T, T>C
- Transversion: A>C, A>T, G>C, G>T and reverse
- Amino acids: PAM and BLOSUM matrices

Base frequencies

Models of among-site substitution rate heterogeneity:
- Weighting particular sites according to relative mutation frequencies (position variability)

Models of Evolution

Jukes-Cantor model:
- All substitution types and base frequencies are presumed equal
- Time reversible

Kimura 2-parameter model:
- Transitions are more likely than transversions
- Equal base frequencies
- Time reversible

Substitution Models / Treeing Methods

- Maximum Parsimony: fixed cost matrices
- Distance Matrix and Maximum Likelihood: general model of sequence evolution

Not addressed:
- Lineage-specific substitutions
- Different rates of evolution between lineages

$$
\begin{pmatrix}
-(a_1+a_2+a_3) & a_4 & a_7 & a_{10} \\
a_1 & -(a_4+a_5+a_6) & a_8 & a_{11} \\
a_2 & a_5 & -(a_7+a_8+a_9) & a_{12} \\
a_3 & a_6 & a_9 & -(a_{10}+a_{11}+a_{12})
\end{pmatrix}
$$

a = relative rate between the different substitutions x frequency of target base

Source: Swofford, Book (Hillis), 1996

Models of Evolution: General Matrix

[Figure: general substitution matrix. Swofford, Book (Hillis), 1996]

Models of Evolution

- General Time Reversible (GTR)
- Jukes-Cantor
- Kimura's 2-parameter model (K2P)

Source: Swofford, Book (Hillis), 1996

Treeing Methods: Classification

"Inferring a phylogeny is really an estimation procedure; we are making a best estimate of an evolutionary history based on incomplete information" (Swofford, 1990)

- Distance-based: Compute pairwise distances and use them to derive the tree.
- Character-based: Work directly on each character of the data. Derive trees that optimize the distribution of the actual data pattern for each character (Maximum Parsimony, Maximum Likelihood).
- Algorithm-based: Generate a tree according to a series of steps (e.g. neighbor joining).
- Criterion-based: Evaluate alternative trees according to some optimization function.
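The difference between the two named models can be illustrated with a small sketch that builds an instantaneous rate matrix from one transition rate and one transversion rate. The function name `rate_matrix` and the numeric rates are assumptions chosen for the example, and equal base frequencies are assumed, as both models do.

```python
BASES = "ACGT"
# Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def rate_matrix(transition_rate, transversion_rate):
    """Return a dict {(from_base, to_base): rate} whose rows sum to zero."""
    q = {}
    for x in BASES:
        for y in BASES:
            if x != y:
                is_transition = (x, y) in TRANSITIONS
                q[(x, y)] = transition_rate if is_transition else transversion_rate
        # The diagonal is minus the total rate of leaving base x.
        q[(x, x)] = -sum(q[(x, y)] for y in BASES if y != x)
    return q


# Jukes-Cantor: every substitution type has the same rate (illustrative value).
jukes_cantor = rate_matrix(1.0, 1.0)
# Kimura 2-parameter: transitions more likely than transversions (illustrative values).
kimura_2p = rate_matrix(2.0, 1.0)
```

Setting both rates equal collapses the two-parameter model into the one-parameter model, which is why the same helper can serve for both.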

The Most Common Methods for Tree Reconstruction

Distance Matrix:
- Calculation of distance matrices by binary comparison of the aligned sequences
- UPGMA or Neighbor Joining

Maximum Parsimony:
- Preservation is more likely than change
- Search for topologies that minimize the total tree length, assuming a minimum number of base changes

Maximum Likelihood:
- Searches for the evolutionary model, including the tree itself, that has the highest likelihood of producing the observed data
- Models: transition/transversion; base frequencies; positional variability

Definitions

[Figure: a radial tree and a dendrogram, labelled with peripheral branch, internal branch, central branch, terminal nodes/tips, links/edges and internal nodes]

Unrooted tree: the location of the common ancestor is not specified.

Distance Matrix: Ultrametric Data

[Figure]

Distance Matrix: Non-Ultrametric Data

UPGMA: Unweighted Pair Group Method with Arithmetic Mean

Distance Matrix: Additive Trees

Example for additive trees: Fitch-Margoliash algorithm
- Calculate the matrix
- Find the most closely related pair of sequences and link it by an internal node
- Link the next related sequence with an internal node
- Calculate branch lengths

Worked example for three sequences A, B and C with branch lengths a, b and c:

A to B = a + b = 22 (1)
A to C = a + c = 39 (2)
B to C = b + c = 41 (3)

Subtract (3) from (2): a - b = 39 - 41 = -2 (4)
Add (1) and (4): 2a = 20, a = 10
From (1) and (2): b = 12, c = 29

Source: Mount, Book, Bioinformatics 2001
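The branch-length step for three sequences can be written in closed form, since each branch is half of "the two distances that contain it minus the one that does not". The sketch below is a minimal illustration; the function name is hypothetical, and the distances 22, 39 and 41 are assumed to be those of the worked example from Mount.

```python
def three_point_branch_lengths(d_ab, d_ac, d_bc):
    """Solve a+b=d_ab, a+c=d_ac, b+c=d_bc for the branch lengths a, b, c."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


# Assumed example distances: A-B = 22, A-C = 39, B-C = 41.
a, b, c = three_point_branch_lengths(22, 39, 41)
print(a, b, c)
```

If the distances are not additive, one of these expressions can become negative, which is the negative branch length problem of least-squares fitting.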

Principle of Neighbor Joining (Saitou and Nei, 1987)

The fully resolved tree is derived from a fully unresolved star tree by successively inserting branches between a pair of closest neighbors and the remaining terminals in the tree.

[Figure: star decomposition of a tree with taxa A to H]

Distance Matrix: Additive Trees

Finding a tree that fits the matrix:
- Find the optimal values for the branching pattern and the branch lengths
- NJ will find the correct tree if the distances are additive

Problem: non-additive distances caused by superimposed changes

[Figure: observed distance vs. real distance]

Dealing with Non-Additive Distances

- Instead of using raw dissimilarity, correct distances based on expected numbers of hidden changes
- For some models (JC, K2P, F84) simple distance equations exist
- For others one must use ML

Outcome: Additivity is not restored!! > An optimality criterion is needed
- Most widely used: least-squares criterion (e.g., Fitch-Margoliash); it can lead to negative branch lengths
- Minimum Evolution (PAUP)

Distance Matrix: Correct for Multiple Changes

[Figure: Jukes and Cantor, 1969]

Pros and Cons (distance methods)

- Very fast
- Only one tree is derived
- Topology and branch lengths are calculated
- Accounts for false identities
- Works with different models of evolution
- Discards the primary character data
- Different sequences can yield the same matrix

"A distance method would reconstruct the true tree if all genetic divergence events were accurately recorded in the sequence" (Swofford, 1996)

Maximum Parsimony

MP is an optimality criterion that appeals to the principle: the simplest explanation of the data is the best.
- Model of evolution: preservation is more likely than change
- Character-based method
- Evaluates trees
- Selects trees that minimize the total tree length
- Needs a set of outgroup taxa
- Calculations are done from the terminal nodes towards the (arbitrary) root
- Implicit model of evolution, no additional model needed
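The correction for hidden (superimposed) changes has a simple closed form under the Jukes-Cantor model: d = -3/4 ln(1 - 4p/3), where p is the observed fraction of differing sites. The sketch below is a minimal illustration; the helper names and the two short sequences are made up for the example, and positions with anything other than A, C, G or T are skipped, which is an assumption rather than a rule from the slides.

```python
import math


def p_distance(seq1, seq2):
    """Observed fraction of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: only columns where both sequences have a plain base are compared.
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed dissimilarity p."""
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75 (saturation)")
    return -0.75 * math.log(1 - 4 * p / 3)


observed = p_distance("ACGTACGTAC", "ACGTTCGTAA")  # 2 of 10 sites differ
corrected = jc69_distance(observed)
print(observed, corrected)
```

The corrected distance is always at least as large as the observed one, and the gap widens as sequences diverge, which mirrors the observed-versus-real distance curve.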

Maximum Parsimony: Evaluation of Trees

- The alignment is checked for informative positions
- To be informative, a site must have the same sequence character in at least two taxa (e.g. sites 1, 2, 3, 5)
- And it must favor one topology over another (only site 5)
- Only the informative sites are analyzed

[Figure: the three possible unrooted topologies for the sequences S1 to S4, with the number of mutations each topology requires at a site]

Pros and Cons (Maximum Parsimony)

- Works directly on the data
- Works fine on data with strong similarity
- Relatively fast
- Does not need a model of evolution
- Calculates only topologies
- Performs weakly on distantly related data
- Prone to false identities (multiple changes): long branch attraction
- Can produce many trees with the same parsimony score

Maximum Likelihood

ML evaluates a hypothesis about evolutionary history in terms of the probability that a proposed model of the evolutionary process and the hypothesized history would give rise to the observed data.
- Character-based method
- Concrete model of evolution needed
- Assumes that nucleotide sites evolve independently
- The likelihood for each site is calculated separately and combined to a total value for a tree
- Looks for the tree with the highest likelihood: L = maximal

Maximum Likelihood

The likelihood of the full tree is the product of the likelihoods at each site:

$$L = L(1) \times L(2) \times \dots \times L(N) = \prod_{j=1}^{N} L(j)$$

Because the probability of any single observation is an extremely small number, the likelihoods are normally handled as logarithms:

$$\ln L = \ln L(1) + \ln L(2) + \dots + \ln L(N) = \sum_{j=1}^{N} \ln L(j)$$

Maximum Likelihood

- For every internal node all four nucleotides are allowed > 4x4 = 16 probabilities
- Each probability is the product of the probability of the base in (6) and the transition/transversion probabilities
- E.g. probability of the base = 0.25, or the average frequency of that base in the sequence (> depends on model)
- Transversion = 10^-6 and transition = 2x10^-6
- Likelihood = 0.25 x 2x10^-6 x 10^-6 = 5x10^-13

Source: Swofford, Book (Hillis), 1996
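The reason for working with logarithms can be shown numerically. The sketch below multiplies many small per-site likelihoods and compares the result with the sum of their logarithms; the per-site value reuses the example product 0.25 x 2x10^-6 x 10^-6, while the number of sites (1000) and the assumption that all sites share the same likelihood are made up for the illustration.

```python
import math

# One per-site likelihood: base probability x transition x transversion probability.
site_likelihood = 0.25 * 2e-6 * 1e-6  # about 5e-13

# Hypothetical alignment of 1000 sites that all have the same likelihood.
site_likelihoods = [site_likelihood] * 1000

# Direct product: underflows to 0.0 in double precision.
tree_likelihood = math.prod(site_likelihoods)

# Sum of logarithms: stays a perfectly ordinary number.
tree_log_likelihood = sum(math.log(l) for l in site_likelihoods)

print(tree_likelihood)
print(tree_log_likelihood)
```

Because the logarithm is monotonic, the tree with the highest log-likelihood is also the tree with the highest likelihood, so nothing is lost by comparing sums instead of products.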

Pros and Cons (Maximum Likelihood)

- Works directly on the data
- Performs well also on distantly related data
- Includes models of evolution
- The whole tree is under evaluation: topologies and branch lengths are optimized
- Currently regarded as the best method
- Computationally intense: the number of sequences is limited

MP vs. ML

[Figure. Swofford, Book (Hillis), 1996]

Searching for Optimal Trees

Exact algorithms:
- Exhaustive search
- Branch-and-bound methods

Heuristic approaches:
- Stepwise addition
- Star decomposition
- Branch swapping

How Many Trees Do We Have to Evaluate

Places to add another taxon:
- Two taxa = 1
- Three taxa = 3
- Four taxa = 5
- Five taxa = 7

The 15 possible unrooted trees for 5 taxa

[Figure]

Exhaustive Topologies

Number of unrooted, bifurcating trees for T sequences:

$$B(T) = \prod_{i=3}^{T} (2i - 5)$$

[Table: number of sequences vs. number of trees, rising to 2.8x10^74 trees]

The root is just another taxon, so for rooted trees:

No. of sequences   2   3   4
No. of trees       1   3   15

Source: Swofford, Book (Hillis), 1996
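The growth in the number of topologies follows directly from the "places to add another taxon" argument: each new taxon can be attached to any existing branch, and a tree with i - 1 taxa has 2i - 5 branches. A short sketch, with hypothetical function names, evaluates the product for a few sizes.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted, bifurcating trees: product of (2i - 5) for i = 3..n."""
    if n_taxa < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    count = 1
    for i in range(3, n_taxa + 1):
        count *= 2 * i - 5  # branches available for attaching taxon i
    return count


def num_rooted_trees(n_taxa):
    """The root behaves like one additional taxon."""
    return num_unrooted_trees(n_taxa + 1)


for n in (3, 4, 5, 10, 50):
    print(n, num_unrooted_trees(n))
```

Python integers have arbitrary precision, so the value for 50 taxa is computed exactly rather than as a floating-point approximation.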

Exact Algorithms

Exhaustive (feasible only for a small number of taxa):
- All trees are evaluated

Branch-and-bound (feasible for somewhat more taxa):
- Construct a random tree with all sequences and evaluate its value L under the chosen optimality criterion, according to the reconstruction method and model used
- This is the initial upper bound of L
- Start to reconstruct trees from 3 to X taxa by stepwise addition of taxa
- Evaluate each tree: if the score exceeds L there is no need to go further along this path; if the score < L, proceed
- If the score at the end of the path is less than L, take this as the new upper bound

Search Tree for Branch-and-Bound

[Figure. Swofford, Book (Hillis), 1996]

Heuristic Approaches: Global vs. Local Optimum

Heuristic tree searches generally operate by hill-climbing methods:
- Start with an initial tree
- Optimize (rearrange) it under the chosen optimality criterion
- If we find no way for further improvement, stop

Problem: There is no way of knowing if we reached the global or merely a local optimum.

Global vs. Local Optimum

[Figure]

Heuristics: Stepwise Addition

Stepwise addition:
- Start with three sequences
- Add the next taxon, evaluate the tree, do rearrangements
- Save the one with the best score, add the next taxon

Addition order:
- In the order of the data in the alignment
- Use a distance algorithm to decide the order, e.g. by closest taxon addition
- Add the taxon that makes the optimal, e.g. shortest, tree
- Random taxon addition order
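The local-optimum problem of hill climbing can be seen on a toy example. The sketch below is not a tree search: it replaces tree space with a made-up one-dimensional list of scores (lower is better, as with tree length under parsimony), and all names and numbers are assumptions chosen for the illustration.

```python
def hill_climb(score, start, neighbors):
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    current = start
    while True:
        best = min(neighbors(current), key=score, default=current)
        if score(best) >= score(current):
            return current  # no improvement possible: local (or global) optimum
        current = best


# Hypothetical score landscape; index 7 (score 2) is the global optimum.
landscape = [9, 7, 8, 6, 4, 5, 3, 2, 6]


def score(i):
    return landscape[i]


def neighbors(i):
    # A "rearrangement" here is just a step to the left or right.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]


stuck = hill_climb(score, 0, neighbors)   # stops at a local optimum
found = hill_climb(score, 5, neighbors)   # happens to reach the global optimum
print(stuck, found)
```

Starting from different initial trees, for example by random taxon addition order, is the usual way to reduce the chance of being trapped in such a local optimum.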

Heuristics: Branch Swapping

Branch swapping:
- Nearest Neighbor Interchange (NNI)
- Subtree pruning and regrafting (SPR)
- Tree bisection and reconnection (TBR)

Hoping to find a better tree by disturbing (rearranging) the tree to overcome local optima.

Problem: If the tree is on a plateau and the global optimum is several steps away, we might still not reach it.

Heuristics: NNI, SPR, TBR

[Figures: the three rearrangement types, after Felsenstein, Book, 2004]

Confidence Tests: Bootstrapping

Bootstrapping:
- Resampling tree evaluation technique
- New data sets are created from the original data set by sampling columns of characters at random with replacement
- Each site can be sampled again with the same probability as any of the other sites
- Problem: some positions can be overrepresented, some sites are missing
- At least 100, better 1,000 trees should be calculated

Remember: High bootstrap values can make a wrong phylogeny look good!!

[Figure: example nucleotide alignment with conservation marks and a consensus line]
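The resampling step of bootstrapping can be sketched in a few lines: columns are drawn at random with replacement until the replicate has as many columns as the original alignment. The function name, the toy alignment and the fixed seed are assumptions for the example; building a tree from each replicate is left to the treeing method of choice.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment (list of equal-length strings)."""
    n_sites = len(alignment[0])
    # Draw column indices with replacement; some columns repeat, others drop out.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


# Hypothetical toy alignment of three sequences and six sites.
alignment = ["ACGTAC", "ACGTTC", "ATGTAC"]
rng = random.Random(42)  # fixed seed so the example is repeatable
replicate = bootstrap_alignment(alignment, rng)
print(replicate)
```

The same column indices are applied to every sequence, so each resampled column is still a genuine column of the original alignment.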

Bootstrapping

[Figure]

Why Do Trees Differ?

- Information content
- Sequencing errors
- Alignment: homology of characters
- Non-additive data (false identities)
- Different and simplified models of evolution
- Independence of data
- Lineage- and/or position-specific rate of evolution
- Data selection: only subsets of organisms and positions
- Calculation heuristics: small amount of evaluated trees, strong dependence on the order of input data
- Local or global optimum?

Consensus Trees

[Figure]

Practical Implications: Filters

Filters:
- Remove or weight down individual alignment columns while treeing
- Keep a balance between data loss and gain of accuracy

E.g. 50% conservation filter:
- Columns in the alignment are only considered for tree reconstruction when at least 50% of the sequences show the same residue

Position variability:
- The position variability for every column is calculated and shown as numbers 1-9 and characters A-Z
- 1 means highly variable
- Z means extremely conserved (never seen)

ARB Filter

[Figure]

Practical Implications

Outgroup:
- Choose as many sequences for the outgroup as possible
- They should not be too distantly related to the group of interest

Data:
- Always use the largest dataset available; if necessary, remove sequences after the calculation
- Compare different algorithms
- Reject problematic data
- Never reconstruct trees or filters on partial sequence data

[Figure: ARB Phylo]
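A conservation filter of this kind can be sketched as a function that returns the indices of the columns to keep. This is an illustration of the idea only and not ARB's implementation; the function name and toy alignment are hypothetical, and gap characters are counted like any other residue, which is an assumption.

```python
from collections import Counter


def conservation_filter(alignment, min_fraction=0.5):
    """Return indices of columns whose most common residue reaches min_fraction."""
    n_seqs = len(alignment)
    kept = []
    for col in range(len(alignment[0])):
        residues = [seq[col] for seq in alignment]
        # Count of the single most frequent residue in this column.
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / n_seqs >= min_fraction:
            kept.append(col)
    return kept


# Hypothetical toy alignment: the last column has four different residues.
alignment = ["ACGT", "ACGA", "ATCC", "AGTG"]
kept_columns = conservation_filter(alignment)
filtered = ["".join(seq[c] for c in kept_columns) for seq in alignment]
print(kept_columns, filtered)
```

Raising `min_fraction` discards more variable columns, which trades information for robustness against alignment uncertainty.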

ARB Internal Architecture: The Concept of ARB

[Figure: the database and database management at the center, connected by request and update links to sequence alignment, phylogenetic reconstructions and the probe functions (Probe_Design, Probe_Match); the PT-Server returns possible probes, matching sequences and the next relative]

PT-Server

- Not delivered with ARB
- Different format of your database for faster performance of sequence search functions within ARB
- It is only used to search the next relative for the automatic aligner and for Probe_Design/Probe_Match
- Creating/updating takes a long time and a lot of memory
- Once it has been created, searching is very fast

Do not overdo it

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

Tree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

Phylogeny Tree Algorithms

Phylogeny Tree Algorithms Phylogeny Tree lgorithms Jianlin heng, PhD School of Electrical Engineering and omputer Science University of entral Florida 2006 Free for academic use. opyright @ Jianlin heng & original sources for some

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline Phylogenetics Todd Vision iology 522 March 26, 2007 pplications of phylogenetics Studying organismal or biogeographic history Systematics ating events in the fossil record onservation biology Studying

More information

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

More information

Theory of Evolution. Charles Darwin

Theory of Evolution. Charles Darwin Theory of Evolution harles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (8-6) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

More information

Phylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification

Phylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification Multiple sequence alignment global local Evolutionary tree reconstruction Pairwise sequence alignment (global and local) Substitution matrices Gene Finding Protein structure prediction N structure prediction

More information

Evolutionary Tree Analysis. Overview

Evolutionary Tree Analysis. Overview CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

More information

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony ioinformatics -- lecture 9 Phylogenetic trees istance-based tree building Parsimony (,(,(,))) rees can be represented in "parenthesis notation". Each set of parentheses represents a branch-point (bifurcation),

More information

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

More information

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

More information

How to read and make phylogenetic trees Zuzana Starostová

How to read and make phylogenetic trees Zuzana Starostová How to read and make phylogenetic trees Zuzana Starostová How to make phylogenetic trees? Workflow: obtain DNA sequence quality check sequence alignment calculating genetic distances phylogeny estimation

More information

Theory of Evolution Charles Darwin

Theory of Evolution Charles Darwin Theory of Evolution Charles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (83-36) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

More information

Phylogenetics. BIOL 7711 Computational Bioscience

Phylogenetics. BIOL 7711 Computational Bioscience Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium

More information

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

More information

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny

More information

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

More information

(Stevens 1991) 1. morphological characters should be assumed to be quantitative unless demonstrated otherwise

(Stevens 1991) 1. morphological characters should be assumed to be quantitative unless demonstrated otherwise Bot 421/521 PHYLOGENETIC ANALYSIS I. Origins A. Hennig 1950 (German edition) Phylogenetic Systematics 1966 B. Zimmerman (Germany, 1930 s) C. Wagner (Michigan, 1920-2000) II. Characters and character states

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

More information

Thanks to Paul Lewis, Jeff Thorne, and Joe Felsenstein for the use of slides

Thanks to Paul Lewis, Jeff Thorne, and Joe Felsenstein for the use of slides hanks to Paul Lewis, Jeff horne, and Joe Felsenstein for the use of slides Hennigian logic reconstructs the tree if we know polarity of characters and there is no homoplasy UPM infers a tree from a distance

More information

Seuqence Analysis '17--lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony

Seuqence Analysis '17--lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony Seuqence nalysis '17--lecture 10 Trees types of trees Newick notation UPGM Fitch Margoliash istance vs Parsimony Phyogenetic trees What is a phylogenetic tree? model of evolutionary relationships -- common

More information

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

More information

Molecular Evolution & Phylogenetics

Molecular Evolution & Phylogenetics Molecular Evolution & Phylogenetics Heuristics based on tree alterations, maximum likelihood, Bayesian methods, statistical confidence measures Jean-Baka Domelevo Entfellner Learning Objectives know basic

More information

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe B C D B C D (((,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood

More information

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5. Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony

More information

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

What is Phylogenetics

Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

Estimating Phylogenies (Evolutionary Trees) II. Biol4230, Thurs, March 2, 2017, Bill Pearson, wrp@virginia.edu, 4-2818, Jordan 6-057

Tree estimation strategies: Parsimony: no model, simply count minimum number

Molecular phylogeny: How to infer phylogenetic trees using molecular sequences

Tore Samuelsson, Nov 200. Applications of phylogenetic methods: Reconstruction of evolutionary history / Resolving taxonomy issues

EVOLUTIONARY DISTANCES

FROM STRINGS TO TREES. Luca Bortolussi, Dipartimento di Matematica ed Informatica, Università degli studi di Trieste, luca@dmi.units.it. Trieste, 14th November 2007. OUTLINE: 1 STRINGS:

Phylogenetic analyses. Kirsi Kostamo

The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

BINF6201/8201. Molecular phylogenetic methods

0-7-06. Phylogenetics: According to the evolutionary theory, all life forms on this planet are related to one another by descent. Traditionally, phylogenetics

A (short) introduction to phylogenetics

Thibaut Jombart, Marie-Pauline Beugin, MRC Centre for Outbreak Analysis and Modelling, Imperial College London. Genetic data analysis with PR Statistics, Millport Field

Phylogenetic Trees: What They Are, Why We Do It & How To Do It. Presented by Amy Harris, Dr Brad Morantz

Overview: What is a phylogenetic tree; Why do we do it; How do we do it; Methods and programs; Parallels

Letter to the Editor. Department of Biology, Arizona State University

Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well. Michael S. Rosenberg and Sudhir Kumar, Department of Biology, Arizona

Phylogenetics. Todd Vision, Spring 2008. Some applications. Uncultured microbial diversity

Tree basics; Sequence alignment; Inferring a phylogeny; Neighbor joining; Maximum parsimony; Maximum likelihood; Rooting trees and measuring confidence; Software and file

Inferring phylogeny. Today's topics. Milestones of molecular evolution studies. Contributions to molecular evolution

Today's topics: Introduction; Distance methods; Parsimony method. Overview of phylogenetic inferences. Methodology. Methods

Sequence Alignment (chapter 6)

The biological problem; Global alignment; Local alignment; Multiple alignment. Introduction to bioinformatics, Autumn 6. Background: comparative genomics. Basic question in biology:

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Homework assignment 5. 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

Consistency Index (CI)

Minimum number of changes divided by the number required on the tree. CI = 1 if there is no homoplasy. Negatively correlated with the number of species sampled. Retention Index (RI)
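
The consistency index described above can be illustrated with a minimal Python sketch. It assumes a single character scored on a rooted binary tree, takes the minimum number of changes to be the number of observed states minus one, and counts the changes required on the tree with Fitch's small-parsimony algorithm. The function names, the nested-tuple tree encoding, and the example data are hypothetical and not taken from the slides.

```python
def fitch_steps(tree, states):
    """Count the changes one character requires on a rooted binary tree.

    tree   -- nested 2-tuples with leaf names (str) at the tips (assumed encoding)
    states -- dict mapping each leaf name to its character state
    """
    def visit(node):
        if isinstance(node, str):              # leaf: its own state, no change yet
            return {states[node]}, 0
        (left_set, left_cost) = visit(node[0])
        (right_set, right_cost) = visit(node[1])
        common = left_set & right_set
        if common:                             # children agree: no extra change
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def consistency_index(tree, states):
    """CI = minimum possible changes / changes required on this tree."""
    min_changes = len(set(states.values())) - 1
    required = fitch_steps(tree, states)
    if required == 0:
        raise ValueError("constant character: CI is undefined")
    return min_changes / required


# Hypothetical four-taxon tree ((A,B),(C,D)).
TREE = (("A", "B"), ("C", "D"))
no_homoplasy = consistency_index(TREE, {"A": "0", "B": "0", "C": "1", "D": "1"})
homoplasy = consistency_index(TREE, {"A": "0", "B": "1", "C": "0", "D": "1"})
```

In this sketch the first character changes once on the tree and gives CI = 1, while the second needs two changes where one would suffice and gives CI = 0.5.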

Background: comparative genomics. Sequence similarity. Homologs. Similarity vs homology. Sequence Alignment (chapter 6)

The biological problem; Global alignment; Local alignment; Multiple alignment. Background: comparative genomics. Basic question in biology: what properties are shared among organisms?

DNA Phylogeny. Signals and Systems in Biology. Kushal Shah @ EE, IIT Delhi

Phylogenetics: Grouping and Division of organisms; Keeps changing with time; Splitting, hybridization and termination. Cladistics :

Effects of Gap Open and Gap Extension Penalties

Brigham Young University, BYU ScholarsArchive, All Faculty Publications, 200-10-01. Hyrum Carroll, hyrumcarroll@gmail.com; Mark J. Clement, clement@cs.byu.edu. See

Phylogenetic trees 07/10/13

A tree is the only figure to occur in On the Origin of Species by Charles Darwin. It is a graphical representation of the evolutionary relationships among entities that share

Finding the best tree by heuristic search

Chapter 4. If we cannot find the best trees by examining all possible trees, we could imagine searching in the space of possible trees. In this chapter we will

Lecture 5: Alignment

I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

Principles of Phylogeny Reconstruction: How do we reconstruct the tree of life? Basic Terminology. Looking at Trees.

Phylogeny: Basic Terminology. Outline: Terminology. Phylogenetic tree: Methods, Problems, parsimony, maximum likelihood, bootstrapping

Phylogeny. November 7, 2017

Phylogenetics: Phylon = tribe/race, genetikos = relative to birth. Phylogenetics: study of evolutionary relationships among organisms, sequences, or anything in between. Related

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics

Up to now we have focused on finding similarities, now we start focusing on differences (dissimilarities leading to distance measures). Identifying

Concepts and Methods in Molecular Divergence Time Estimation

26 November 2012. Prashant P. Sharma, American Museum of Natural History. Overview: 1. Why do we date trees? 2. The molecular clock 3. Local clocks

Introduction to Bioinformatics

Lecture: The biological problem; Global alignment; Local alignment; Multiple alignment. Background: comparative genomics. Basic question in biology: what properties

Phylogeny: building the tree of life

Dr. Fayyaz ul Amir Afsar Minhas, Department of Computer and Information Sciences, Pakistan Institute of Engineering & Applied Sciences, PO Nilore, Islamabad, Pakistan

8/23/2014. Phylogeny and the Tree of Life

Chapter 26. Objectives: Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification. List the major

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny: the evolutionary history of a species

Inferring Molecular Phylogeny

Walter Salzburger. The tree of life, Gustav Klimt (1907). 1. Molecular Markers. Immunological comparisons: Nuttall

A phylogenetic view on RNA structure evolution

Chapter 26: Phylogeny and the Tree of Life. Phylogenies Show Evolutionary Relationships

You Must Know: The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

Consensus Methods. * You are only responsible for the first two

Consensus Trees
* consensus trees reconcile clades from different trees
* consensus is a conservative estimate of phylogeny that emphasizes points of agreement
* philosophy: agreement among data sets is

Biology 559R: Introduction to Phylogenetic Comparative Methods. Topics for this week (Jan 27 & 29):

Statistical estimation of models of sequence evolution. Phylogenetic inference using maximum likelihood:

Midterm Exam #1. MB 451 Microbial Diversity. Honor pledge: I have neither given nor received unauthorized aid on this test.

Signed : Date : Feb 5, 2007 Name : KEY 1. What are the three primary evolutionary branches

NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees

Erin Molloy and Tandy Warnow, {emolloy2, warnow}@illinois.edu, University of Illinois at Urbana

Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression)

"A comparison of the structures

Phylogenetic inference: from sequences to trees

Westfälische Wilhelms-Universität Münster, Evolutionary Functional Genomics. Bioinformatics 1.

Additive distances

Let $T$ be a tree on leaf set $S$ and let $w : E \to \mathbb{R}^{+}$ be an edge-weighting of $T$, and assume $T$ has no nodes of degree two. Let $D_{ij} = \sum_{e \in P_{ij}} w(e)$, where $P_{ij}$ is the path in $T$ from $i$ to $j$. Then the matrix $[D_{ij}]$ is said to be additive.
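
The definition of additive distances above can be sketched in Python under stated assumptions: the tree is given as a list of weighted edges, and the leaf-to-leaf distances are obtained by summing edge weights along the unique path between each pair of leaves. The additivity check uses the four-point condition, a standard characterisation of additive matrices that is not part of the excerpt itself. The function names and the example quartet tree are hypothetical.

```python
from itertools import combinations


def tree_distances(edges, leaves):
    """Return D[i][j] = sum of edge weights on the path between leaves i and j."""
    adjacency = {}
    for u, v, weight in edges:
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))
    dist = {}
    for source in leaves:
        seen = {source: 0.0}
        stack = [source]
        while stack:                      # depth-first walk; paths in a tree are unique
            node = stack.pop()
            for neighbour, weight in adjacency[node]:
                if neighbour not in seen:
                    seen[neighbour] = seen[node] + weight
                    stack.append(neighbour)
        dist[source] = {leaf: seen[leaf] for leaf in leaves}
    return dist


def is_additive(dist, leaves, tol=1e-9):
    """Four-point condition: for every quartet, the two largest of the
    three pairwise distance sums must be equal."""
    for a, b, c, d in combinations(leaves, 4):
        sums = sorted([dist[a][b] + dist[c][d],
                       dist[a][c] + dist[b][d],
                       dist[a][d] + dist[b][c]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical quartet: leaves A, B attach to internal node x; C, D attach to y.
EDGES = [("A", "x", 1.0), ("B", "x", 2.0), ("x", "y", 3.0),
         ("C", "y", 4.0), ("D", "y", 5.0)]
LEAVES = ["A", "B", "C", "D"]
D = tree_distances(EDGES, LEAVES)
```

Because `D` is built from path sums on a tree, `is_additive(D, LEAVES)` holds by construction; perturbing a single entry generally breaks the condition.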

Multiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:50

5 Multiple Sequence Alignment. The first part of this exposition is based on the following sources, which are recommended reading:

Bootstrapping and Tree reliability. Biol4230, Tues, March 13, 2018, Bill Pearson, wrp@virginia.edu, 4-2818, Pinn 6-057

Rooting trees (outgroups). Bootstrapping: given a set of sequences, sample positions randomly,
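
The resampling step mentioned above ("sample positions randomly") can be sketched in Python. This is a minimal illustration that assumes the usual bootstrap convention of drawing alignment columns with replacement until the replicate has the original length; the truncated excerpt does not spell this out. The function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate of an alignment.

    alignment -- dict mapping sequence name to an aligned string (equal lengths)
    rng       -- a random.Random instance, passed in so runs are reproducible
    """
    length = len(next(iter(alignment.values())))
    # Draw column indices with replacement; the same column may appear twice.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment.
ALIGNMENT = {"s1": "ACGT", "s2": "AGGT", "s3": "ACGA"}
replicate = bootstrap_columns(ALIGNMENT, random.Random(0))
```

Each replicate keeps whole columns together, so every site in it is a site of the original alignment; a tree would then be rebuilt from each replicate to estimate branch support.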

C. DARWIN (1809-1882)

LAMARCK: Each evolutionary lineage has evolved, transforming itself, from an ancestor that appeared by spontaneous generation. DARWIN: All organisms are historically interconnected. Their relationships

Notes for 848 lecture 6: A ML basis for compatibility and parsimony

Notation: $\theta \in \Theta$ (1). $\Theta$ is the space of all possible trees (and model parameters); $\theta$ is a point in the parameter space = a particular tree

Phylogenetics: Parsimony

COMP 571, Luay Nakhleh, Rice University. The Problem. Input: Multiple alignment of a set S of sequences. Output: Tree leaf-labeled with S. Assumptions: Characters are mutually independent

A Fitness Distance Correlation Measure for Evolutionary Trees

Hyun Jung Park (1) and Tiffani L. Williams (2). (1) Department of Computer Science, Rice University, hp6@cs.rice.edu. (2) Department of Computer Science

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

- Phylogeny?
- Systematics?
- Phylogenetic systematics? Connection between phylogeny and classification.
- Phylogenetic systematics informs the

Sequence Alignment: A General Overview. COMP 571 - Fall 2010, Luay Nakhleh, Rice University

Life through Evolution: All living organisms are related to each other through evolution. This means: any pair of

Algorithms in Bioinformatics

Sami Khuri, Department of Computer Science, San José State University, San José, California, USA, khuri@cs.sjsu.edu, www.cs.sjsu.edu/faculty/khuri. Pairwise Sequence Alignment. Homology

Outline. Sequence-comparison methods. Buzzzzzzzz. Why compare sequences? Gerard Kleywegt, Uppsala University

MB330 - January, 2006. Outline: Why compare sequences?; Dotplots; Pairwise sequence alignments & Multiple sequence alignments; Profile methods

Multiple Alignment. Slides revised and adapted to Bioinformática IST 2005, Ana Teresa Freitas

An Introduction to Bioinformatics Algorithms. Outline: Dynamic Programming

Molecular Evolution, course #27615. Final Exam, May 3, 2006

This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least

Molecular Evolution and Phylogenetic Tree Reconstruction

Orthology, Paralogy, Inparalogs, Outparalogs. Phylogenetic Trees: Nodes: species. Edges: time of independent evolution. Edge length

Cladistics and Bioinformatics Questions 2013

AP Biology. 1. The following table shows the percentage similarity in sequences of nucleotides from a homologous gene derived from five different species

Copyright notice. Molecular Phylogeny and Evolution. Goals of the lecture. Introduction. December 15, 2008

Bioinformatics, J. Pevsner, pevsner@kennedykrieger.org. Many of the images in this powerpoint presentation are from Bioinformatics and Functional

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT

Contents: What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees. What type questions

Molecular Phylogenetics (part 1 of 2). Computational Biology Course, João André Carriço

jcarrico@fm.ul.pt. Charles Darwin (1809-1882). Charles Darwin's tree of life in Notebook B, 1837-1838. Ernst Haeckel (1834-1919)

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches

Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx. Hyrum D. Carroll, Perry G. Ridge, Mark J. Clement, Quinn O. Snell

Phylogenetics: Bayesian Phylogenetic Analysis. COMP 571 - Spring 2015, Luay Nakhleh, Rice University

Bayes Rule: $P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{P(X = x)\,P(Y = y \mid X = x)}{\sum_{x'} P(X = x')\,P(Y = y \mid X = x')}$

Homology Modeling. Roberto Lins, EPFL - summer semester 2005

Disclaimer: course material is mainly taken from: P.E. Bourne & H Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton,

Phylogeny and Evolution. Gina Cannarozzi, ETH Zurich, Institute of Computational Science

History: Aristotle (384-322 BC) classified animals. He found that dolphins do not belong to the fish but to the mammals.

Week 8: Testing trees, Bootstraps, jackknifes, gene frequencies

Genome 570, February, 2016. Normal distribution:

Estimating Evolutionary Trees. Phylogenetic Methods

- if the data are consistent with infinite sites then all methods should yield the same tree
- it gets more complicated when there is homoplasy, i.e., parallel or convergent

MOLECULAR SYSTEMATICS: A SYNTHESIS OF THE COMMON METHODS AND THE STATE OF KNOWLEDGE

CELLULAR & MOLECULAR BIOLOGY LETTERS, http://www.cmbl.org.pl. Received: 16 August 2009. Volume 15 (2010) pp 311-341. Final form accepted: 01 March 2010. DOI: 10.2478/s11658-010-0010-8. Published online: 19 March

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson, Feb 2016

Molecular phylogeny is being used in the identification and characterization of new pathogens,

Algorithms in Bioinformatics FOUR: Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Sami Khuri, Department of Computer Science, San José State University. Pairwise Sequence Alignment: Homology; Similarity; Global string alignment; Local string alignment; Dot