Bioinformatics: Phylogeny

Jacques.van.Helden@ulb.ac.be
Université Libre de Bruxelles, Belgique
Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe)
http://www.bigre.ulb.ac.be/

Species trees versus molecule trees

- A species tree aims at representing the evolutionary relationships between species.
- A molecule tree represents the evolutionary history of a family of related molecules (genes, proteins).
- Species trees and gene trees are generally related...
  - A species tree can be inferred from various criteria, including the history of carefully chosen molecules.
- ... but not identical.
  - A molecular family can contain several copies in the same species (in-paralogs), due to gene duplications.
  - Some molecules can be transferred horizontally between species.
  - Due to combinations of duplications and divergences, the tree of a given gene may be inconsistent with the species tree.
- Illustration: Figure 7.3 from Zvelebil and Baum.

Source: Zvelebil, M.J. and Baum, J.O. (2008) Understanding Bioinformatics. Garland Science, New York and London.

Tree reconciliation

[Figure: tree reconciliation.]
Source: Zvelebil, M.J. and Baum, J.O. (2008) Understanding Bioinformatics. Garland Science, New York and London.

Concept definitions from Fitch (2000)

- Discussion of the definitions given in the paper
  - Fitch, W. M. (2000). Homology: a personal view on some of the problems. Trends Genet 16, 227-31.
- Homology
  - Owen (1843): "the same organ under every variety of form and function".
  - Fitch (2000): homology is the relationship of any two characters that have descended, usually with divergence, from a common ancestral character. Note: a character can be a phenotypic trait, a site at a given position of a protein, a whole gene, ...
  - Molecular application: two genes are homologous if they diverged from a common ancestral gene.
- Analogy: relationship of two characters that have developed convergently from unrelated ancestors.
- Cenancestor: the most recent common ancestor of the taxa under consideration.
- Orthology: relationship of any two homologous characters whose common ancestor lies in the cenancestor of the taxa from which the two were obtained.
- Paralogy: relationship of two characters arising from a duplication of the gene for that character.
- Xenology: relationship of any two characters whose history, since their common ancestor, involves interspecies (horizontal) transfer of the genetic material for at least one of those characters.

[Diagram: classification of relationships. Analogy versus homology; homology divides into paralogy (xenology or not; xenologs from paralogs) and orthology (xenology or not).]

Exercise

- On the basis of Fitch's definitions (previous slide), qualify the relationship between each pair of genes in the illustrative schema (genes A1, AB1, B1, B2, C1, C2, C3).
  - P: paralog
  - O: ortholog
  - X: xenolog
  - A: analog
- Example: B1 versus C1
  - The two genes (B1 and C1) were obtained from taxa B and C, respectively.
  - The cenancestor (blue arrow) is the taxon that preceded the second speciation event (Sp2).
  - The common ancestor gene (green dot) coincides with the cenancestor.
  - Therefore B1 and C1 are orthologs.

Notes:
- Orthologs can formally be defined as genes separated by a speciation event (ex: a1 and a2).
- Paralogs can formally be defined as genes separated by a gene duplication event (ex: b2 and b2').
- Source: Zvelebil & Baum, 2008.
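Fitch's definitions can be applied mechanically once each internal node of a gene tree is labelled as a speciation or a duplication: a pair of genes is orthologous if its last common ancestor is a speciation node, and paralogous if it is a duplication node. The Python sketch below illustrates this under loud assumptions: the tree topology is a hypothetical reconstruction written for illustration (it is not copied from the figure), the xenolog AB1 is left out because horizontal transfer needs extra bookkeeping, and the names `TREE`, `leaves` and `classify_pairs` are invented here.

```python
# Hypothetical reconciled gene tree (an assumption made for illustration).
# Internal node: (event, left, right) with event "S" (speciation) or "D" (duplication).
# Leaf: gene name as a string.
TREE = ("S", "A1",
        ("D",
         ("S", "B1", "C1"),
         ("S", "B2", ("D", "C2", "C3"))))


def leaves(node):
    """Return the list of gene names found below a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label every pair of genes by the event at its last common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    event, left, right = node
    label = "O" if event == "S" else "P"
    # Genes on opposite sides of this node have this node as last common ancestor.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = label
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


relations = classify_pairs(TREE)
print(relations[frozenset(("B1", "C1"))])  # O (last common ancestor is a speciation)
print(relations[frozenset(("B1", "C2"))])  # P (last common ancestor is a duplication)
```

With a different assumed topology the labels change accordingly, which is the point of the exercise: the relationship depends only on the type of event at the last common ancestor.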

Exercise

- Example: B1 versus C2
  - The two genes (B1 and C2) were obtained from taxa B and C, respectively.
  - The common ancestor gene (green dot) is the gene that just preceded the duplication Dp1.
  - This common ancestor is much earlier than the cenancestor (blue arrow).
  - Therefore B1 and C2 are paralogs.

Solution to the exercise

Relationship between each pair of genes (O: ortholog, P: paralog, X: xenolog; I marks the diagonal):

       A1   AB1  B1   B2   C1   C2   C3
A1     I
AB1    X    I
B1     O    X    I
B2     O    X    P    I
C1     O    X    O    P    I
C2     O    X    P    O    P    I
C3     O    X    P    O    P    P    I

Source: Zvelebil & Baum, 2008.

Cladistics, cladograms and clades

- Cladistics
  - (Greek: klados = branch) is a branch of biology that determines the evolutionary relationships between organisms based on derived similarities (source: Wikipedia).
- Cladogram
  - Tree-like drawing, usually with binary bifurcations, representing one evolutionary scenario about divergences between species or sequences.
- Clade
  - Any sub-tree of a cladogram.
- Note: branch lengths do not reflect evolutionary time.

[Figure: cladogram of a family of E. coli proteins (leaves such as YBIH_ECOLI, BETI_ECOLI, YBJK_ECOLI, TETC_ECOLI, UIDR_ECOLI, TER1_ECOLI to TER5_ECOLI, ACRR_ECOLI, ENVR_ECOLI).]

Phylogram

- Phylogram: tree-like structure representing an evolutionary scenario, and including
  - the events of divergence between species or sequences;
  - the evolutionary time between each species and the divergence events.

[Figure: phylogram of the same family of E. coli proteins.]

Molecular clock
- The "molecular clock" hypothesis (left tree) assumes that rates of evolution do not vary between branches. All leaf nodes are thus aligned vertically.
- This hypothesis is not always valid.
  - In some cases, two genes diverge from a common ancestor, but one of them diverges faster than the other. This is a rather classical mechanism of evolution: a duplication creates some redundancy, and one copy of the gene evolves whereas the other one retains the initial function.

[Figure, left: ultrametric tree (with clock), e.g. UPGMA. Right: tree without clock, e.g. neighbour-joining. Both trees show the same MetA sequences (META_BRUME, META_RHIME, Q8UBY0, META_CAMJE, META_VIBCH, META_YERPE, META_ECOLI, META_ECO57, META_SALTI, META_SALTY, META_LACLA, META_STRPN, AAL00238, META_BACSU, META_THEMA, META_BACHD, META_CLOAB).]

Phylogenetic inference from sequence comparison

- Alternative approaches
  - Maximum parsimony
  - Distance
  - Maximum likelihood

[Flowchart: unaligned sequences, sequence alignment, aligned sequences; strong similarity? many (> 20) sequences? The highlighted outcome is maximum parsimony.]
Source: Mount (2000)
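Returning to the molecular clock hypothesis: when all leaves are equidistant from the root, the leaf-to-leaf distances are ultrametric, meaning that for any three leaves the two largest of the three pairwise distances are equal. The Python sketch below checks this three-point condition on a distance matrix. The function name `is_ultrametric`, the tolerance and the two toy matrices are assumptions made up for illustration; they do not come from the slides.

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Check the three-point condition on a symmetric distance matrix.

    For every triple of leaves, the two largest pairwise distances
    must be equal (within tol) if the distances fit a clock-like tree.
    """
    n = len(dist)
    for i, j, k in combinations(range(n), 3):
        _, middle, largest = sorted((dist[i][j], dist[i][k], dist[j][k]))
        if abs(largest - middle) > tol:
            return False
    return True


# Toy matrices (invented values).
clock_like = [[0, 2, 6],
              [2, 0, 6],
              [6, 6, 0]]
one_fast_branch = [[0, 2, 6],
                   [2, 0, 9],
                   [6, 9, 0]]

print(is_ultrametric(clock_like))       # True
print(is_ultrametric(one_fast_branch))  # False
```

Real distance matrices are noisy, so an exact test like this one almost never passes; in practice a looser tolerance or a statistical test of the clock would be needed.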

Maximum parsimony

- For each column of the alignment, all possible trees are evaluated and the tree with the smallest number of mutations is retained.
- The trees which fit with the highest number of columns are retained.
- The program can return several trees.

position  1 2 3 4 5 6 7 8 9
seq1      A A G A G T G C A
seq2      A G C C G T G C G
seq3      A G A T A T C C A
seq4      A G A G A T C C G

[Figure: the three possible unrooted trees for seq1 to seq4, evaluated on column 5 (seq1 G, seq2 G, seq3 A, seq4 A), with the required mutations marked on the branches. The tree grouping seq1 with seq2 requires a single mutation; the other two require two.]

Adapted from Mount (2000)

Maximum parsimony example

- Parsimony tree calculated from a multiple alignment of the E. coli proteins containing a LacI-type HTH domain.
  - Left: text representation (protpars output).
  - Bottom right: visualized with njplot (in the ClustalX distribution).

[Figure: unrooted parsimony tree in protpars text format, with leaves CYTR_ECOLI, EBGR_ECOLI, CSCR_ECOLI, IDNR_ECOLI, GNTR_ECOLI, MALI_ECOLI, TRER_ECOLI, YCJW_ECOLI, LACI_ECOLI, FRUR_ECOLI, RAFR_ECOLI, ASCG_ECOLI, GALS_ECOLI, GALR_ECOLI, RBSR_ECOLI, PURR_ECOLI. protpars messages: "remember: this is an unrooted tree!" and "requires a total of 4095.000".]

Maximum parsimony - drawbacks

- The number of trees to evaluate increases exponentially with the number of sequences.
- Assumes that all sequences evolved at the same rate (molecular clock hypothesis).
- Only works for well conserved sequence families.

Phylogenetic inference from sequence comparison

- Alternative approaches
  - Maximum parsimony
  - Distance
  - Maximum likelihood

[Flowchart: unaligned sequences, sequence alignment, aligned sequences; strong similarity? many (> 20) sequences? leads to maximum parsimony; clear similarity? leads to distance. The highlighted outcome is distance.]
Source: Mount (2000)

Distance method

Distance matrix

- Starting from a multiple alignment, calculate the distance between each pair of sequences.
- Calculate a tree which fits as well as possible with the distance matrix.
  - Branch lengths should correspond to distances.
  - Rooted or unrooted.
- Several methods can be used for calculating a tree from the distance matrix.
  - Fitch-Margoliash
  - Neighbour-Joining
  - UPGMA

Workflow: aligned sequences -> distance calculation -> distance matrix -> tree calculation -> tree

- The distance matrix indicates the distance between each pair of sequences.
- The matrix is symmetrical, and the diagonal only contains 0s.

Columns are numbered in the same order as the rows.

 #  Sequence       1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17
 1  META_BACHD   0.00  0.51  0.50  0.65  0.64  0.82  0.74  0.73  0.76  0.76  0.77  0.68  0.95  0.58  0.76  0.76  0.91
 2  META_BACSU   0.51  0.00  0.66  0.81  0.80  0.90  0.86  0.85  0.88  0.88  0.85  0.80  0.87  0.65  0.99  0.98  1.05
 3  META_CLOAB   0.50  0.66  0.00  0.75  0.74  0.79  0.81  0.82  0.83  0.83  0.85  0.82  0.82  0.60  0.79  0.80  0.96
 4  META_STRPN   0.65  0.81  0.75  0.00  0.00  0.74  0.87  0.88  0.89  0.90  0.90  0.88  1.04  0.74  1.07  1.07  1.02
 5  AAL00238     0.64  0.80  0.74  0.00  0.00  0.74  0.87  0.87  0.89  0.89  0.89  0.87  1.03  0.73  1.06  1.06  1.01
 6  META_LACLA   0.82  0.90  0.79  0.74  0.74  0.00  0.93  0.93  0.96  0.95  0.94  0.95  0.99  0.78  1.11  1.15  1.07
 7  META_ECOLI   0.74  0.86  0.81  0.87  0.87  0.93  0.00  0.02  0.06  0.05  0.24  0.46  1.04  0.81  1.03  0.97  1.08
 8  META_ECO57   0.73  0.85  0.82  0.88  0.87  0.93  0.02  0.00  0.06  0.05  0.24  0.46  1.03  0.82  1.03  0.97  1.08
 9  META_SALTI   0.76  0.88  0.83  0.89  0.89  0.96  0.06  0.06  0.00  0.01  0.26  0.46  1.08  0.82  1.08  1.00  1.11
10  META_SALTY   0.76  0.88  0.83  0.90  0.89  0.95  0.05  0.05  0.01  0.00  0.25  0.46  1.09  0.82  1.08  1.01  1.11
11  META_YERPE   0.77  0.85  0.85  0.90  0.89  0.94  0.24  0.24  0.26  0.25  0.00  0.43  0.94  0.84  1.06  1.04  1.08
12  META_VIBCH   0.68  0.80  0.82  0.88  0.87  0.95  0.46  0.46  0.46  0.46  0.43  0.00  0.96  0.72  1.07  1.00  1.10
13  META_CAMJE   0.95  0.87  0.82  1.04  1.03  0.99  1.04  1.03  1.08  1.09  0.94  0.96  0.00  0.97  1.15  1.12  1.31
14  META_THEMA   0.58  0.65  0.60  0.74  0.73  0.78  0.81  0.82  0.82  0.82  0.84  0.72  0.97  0.00  0.78  0.75  0.89
15  META_RHIME   0.76  0.99  0.79  1.07  1.06  1.11  1.03  1.03  1.08  1.08  1.06  1.07  1.15  0.78  0.00  9     0.55
16  Q8UBY0       0.76  0.98  0.80  1.07  1.06  1.15  0.97  0.97  1.00  1.01  1.04  1.00  1.12  0.75  9     0.00  0.54
17  META_BRUME   0.91  1.05  0.96  1.02  1.01  1.07  1.08  1.08  1.11  1.11  1.08  1.10  1.31  0.89  0.55  0.54  0.00
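As a minimal sketch of the distance calculation step, the Python code below computes a matrix of p-distances (the fraction of aligned positions that differ) for the four short sequences of the maximum parsimony example. This is an assumption chosen for simplicity: the matrix shown above contains values greater than 1, so it was evidently produced with a corrected distance, which this sketch does not implement. The function names `p_distance` and `distance_matrix` are invented for illustration, and gaps are not handled.

```python
def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Return (names, matrix) with matrix[i][j] the p-distance between names[i] and names[j]."""
    names = list(seqs)
    matrix = [[p_distance(seqs[r], seqs[c]) for c in names] for r in names]
    return names, matrix


# The four sequences of the maximum parsimony example.
alignment = {
    "seq1": "AAGAGTGCA",
    "seq2": "AGCCGTGCG",
    "seq3": "AGATATCCA",
    "seq4": "AGAGATCCG",
}

names, matrix = distance_matrix(alignment)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{d:.2f}" for d in row))
```

By construction the result is symmetrical with zeros on the diagonal, as stated for the matrix above.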

Trees

[Figure: a rooted tree and two unrooted trees over seq1 to seq5, with labels for the root, the branches (b1 to b4), the nodes and the leaf nodes.]

- The distance between two nodes is the sum of the lengths of the branches between them.

Methods for calculating trees from a distance matrix

- It is usually not possible to find a tree whose branch lengths fit all the values of the distance matrix.
- Several approaches exist to calculate a tree which approximates the distances.
  - The Fitch-Margoliash method minimizes the sum of squared differences between the distances in the matrix and the distances in the tree.
  - The Neighbour-Joining (NJ) method minimizes the sum of branch lengths for the resulting tree. This method does not assume a molecular clock: it is thus appropriate when some proteins have evolved faster than others. It returns an unrooted tree.
  - The Unweighted Pair-Group Method with Arithmetic mean (UPGMA) clusters the sequences by order of distance in the distance matrix. This method relies on the assumption of an evolutionary clock, and it produces a rooted tree.

Example of phylogenetic tree

- This tree was obtained with the Neighbour-Joining method (implemented in ClustalX).
- The drawing was obtained with njplot (part of the ClustalX package).
- Each branch of the tree is labelled with the distance.
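As a rough illustration of the UPGMA clustering described above, the Python sketch below repeatedly merges the two closest clusters and recomputes distances to the merged cluster as a size-weighted average. It is a teaching sketch, not the PHYLIP or ClustalX implementation: the function name `upgma`, the nested-tuple tree representation and the toy matrix are assumptions invented for illustration.

```python
def upgma(names, dist):
    """Cluster leaves by UPGMA.

    Returns (tree, height): tree is a nested tuple of leaf names and
    height is the height of the root (half the last merge distance).
    """
    # clusters: id -> (subtree, number of leaves, height of the subtree)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # d: (smaller id, larger id) -> distance between the two clusters
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (i, j), dij = min(d.items(), key=lambda item: item[1])
        tree_i, size_i, _ = clusters.pop(i)
        tree_j, size_j, _ = clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster:
        # average over all leaf pairs, i.e. a size-weighted mean.
        merged = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            merged[(k, next_id)] = (size_i * dik + size_j * djk) / (size_i + size_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(merged)
        # Under the clock assumption the new node sits at half the merge distance.
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j, dij / 2)
        next_id += 1
    (tree, _, height), = clusters.values()
    return tree, height


# Toy ultrametric matrix (invented values).
toy_names = ["A", "B", "C", "D"]
toy_dist = [[0, 2, 6, 10],
            [2, 0, 6, 10],
            [6, 6, 0, 10],
            [10, 10, 10, 0]]

tree, height = upgma(toy_names, toy_dist)
print(tree, height)  # ('D', ('C', ('A', 'B'))) 5.0
```

Because every merge places the new node at half the distance between the merged clusters, all leaves end up at the same distance from the root, which is exactly the clock assumption mentioned above.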
[Figure: neighbour-joining tree of aspartokinase sequences (leaves such as sw P08497 LPA2_BACSU, sw P00562 AK2H_ECOLI, sw Q9ZCI7 AK_RICPR, sw P10869 AK_YEAST, sw P08660 AK3_ECOLI, sw Q9PK32 AK_CHLMU), with the distance written on each branch.]

Distance-based methods for calculating trees in the package PHYLIP

- Summary of the methods for calculating a tree from a distance matrix.

PHYLIP program  Method             Rooted tree  Time    Accuracy  Remarks
fitch           Fitch-Margoliash   no           O(n^4)  higher    loss of accuracy when the tree contains long branches
kitsch          Fitch-Margoliash   yes          O(n^4)  higher
neighbor        neighbour-joining  no           O(n^2)  lower     suitable when the rate of evolution varies among branches
neighbor        UPGMA              yes          O(n^2)  lower     assumes a constant rate of evolution along the branches

Bootstrapping

- In some cases, the data does not allow phylogeny to be inferred reliably.
- To assess the reliability of the inference, one can apply the bootstrap method.
  - Given an alignment of n sequences and p columns, one performs a random selection of p columns, with replacement. Some columns can thus be selected multiple times, whilst others are not selected at all.
  - Calculate a tree with the sampled columns.
  - Repeat many times, and check whether the same branches occur frequently (e.g. > 70%).

[Figure: the same aspartokinase tree, with bootstrap counts (e.g. 788, 677, 509, 992, 996, 994, 990) written on the internal branches.]

Phylogenetic inference from sequence comparison

- Alternative approaches
  - Maximum parsimony
  - Distance
  - Maximum likelihood

[Flowchart: unaligned sequences, sequence alignment, aligned sequences; strong similarity? many (> 20) sequences? leads to maximum parsimony; clear similarity? leads to distance; otherwise maximum likelihood. The highlighted outcome is maximum likelihood.]
Source: Mount (2000)
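The column resampling at the heart of the bootstrap procedure described above can be sketched in a few lines of Python. This is only the sampling step, under stated assumptions: the function name `bootstrap_alignment`, the fixed random seed and the choice of 100 replicates are arbitrary illustrations, and the alignment reuses the four short sequences of the maximum parsimony example. Each replicate would then be passed to a tree-building method (PHYLIP provides seqboot for the resampling and consense for summarizing the resulting trees).

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment is a list of equal-length strings (n sequences, p columns).
    The same p sampled column indices are applied to every sequence,
    so columns stay intact and only their selection changes.
    """
    p = len(alignment[0])
    columns = [rng.randrange(p) for _ in range(p)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


alignment = ["AAGAGTGCA", "AGCCGTGCG", "AGATATCCA", "AGAGATCCG"]
rng = random.Random(0)  # fixed seed so that the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]

# Every replicate keeps n sequences and p columns.
print(len(replicates), len(replicates[0]), len(replicates[0][0]))  # 100 4 9
```

Sampling column indices once per replicate, rather than once per sequence, is what keeps each resampled column a genuine column of the original alignment.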

Practicals with phylogeny.fr

Phylogeny.fr

- http://www.phylogeny.fr
- Offers a user-friendly interface to run all the steps for inferring phylogeny from a set of unaligned sequences.
  - Completely automated workflow or user-specified parameters.
  - Alternative methods for each step of the workflow.
  - Results are exported in multiple formats (convenient for using them with other programs).
  - Results can be displayed immediately (for fast programs) or sent by email (slow programs).

Phylogeny.fr: sequence input

- The "one click" option only requires you to enter a set of sequences and click on the submit button.

Phylogeny.fr: workflow

- At each step of the workflow, you can
  - check the parameters used for the analysis;
  - choose alternative parameters (advanced use);
  - export the intermediate and final results in a variety of formats, which can then be opened in other programs.

Phylogeny.fr - alignment result

[Screenshot: alignment result.]

Phylogeny.fr - phylogenetic tree in text format

[Screenshot: tree in text format.]

Phylogeny.fr - Phylogram (various output formats are supported)

[Screenshot: phylogram.]

Phylogeny.fr - display options

[Figures:
- Phylogram with an outgroup added (Bacillus) but not correctly rooted (midpoint rooting).
- Cladogram incorrectly rooted (midpoint).
- Phylogram rooted with an outgroup.]

Further reading

- Textbooks
  - Zvelebil, M.J. and Baum, J.O. (2008) Understanding Bioinformatics. Garland Science, New York and London.
  - Mount, D.W. (2001) Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, New York.
  - Pevsner, J. (2003) Bioinformatics and Functional Genomics. Wiley.
    + all his teaching material on http://pevsnerlab.kennedykrieger.org/bioinfo_course.htm

Supplementary material

PHYLIP flowchart

- Bootstrapping: seqboot
- Input: aligned sequences
- Distance calculation: protdist, dnadist (output: distance matrix)
  - Neighbor-joining: neighbor
  - UPGMA: neighbor (rooted)
  - Fitch-Margoliash: fitch (unrooted), kitsch (rooted)
- Parsimony: protpars, dnapars
- Branch-and-bound: dnapenny
- Maximum likelihood: dnaml, protml
- Output: tree
  - retree
  - consense
- Tree drawing
  - drawtree: drawing of unrooted tree
  - drawgram: drawing of rooted tree

Taxonomy of bacteria having a metA gene (August 2004)

[Figure: taxonomic tree of Bacteria. Taxa shown: Firmicutes (Bacillales, Bacillaceae, Bacillus; Clostridia, Clostridiales, Clostridium; Lactobacillales, Streptococcaceae, Lactococcus, Streptococcus); Proteobacteria (Alpha subdivision: Rhizobiaceae group, Rhizobium, Sinorhizobium, Brucella; Epsilon subdivision: Campylobacter group, Campylobacter; Gamma subdivision: Enterobacteriaceae, Escherichia, Salmonella, Yersinia; Vibrionaceae, Vibrio); Thermotogae (Thermotogae (class), Thermotogales, Thermotoga).]

Tree nomenclature

- Node
- Leaf
- Internal branch
- External branch

Alignment methods

[Figure: alignment methods.]
Source: Zvelebil, M.J. and Baum, J.O. (2008) Understanding Bioinformatics. Garland Science, New York and London.

Evolutionary model

[Figure: evolutionary model.]
Source: Zvelebil, M.J. and Baum, J.O. (2008) Understanding Bioinformatics. Garland Science, New York and London.