Molecular Phylogeny and Evolution. December 15, 2008


Molecular Phylogeny and Evolution
December 15, 2008
Bioinformatics
J. Pevsner

Copyright notice
Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by J. Pevsner. Copyright 2003 by Wiley. These images and materials may not be used without permission from the publisher.

Figure: five-kingdom system (Haeckel, 1879). Labels: animals, plants, fungi, protists, monera; mammals, vertebrates, invertebrates, protozoa.

Goals of the lecture
Introduction to evolution and phylogeny
Nomenclature of trees
Five stages of molecular phylogeny:
[1] selecting sequences
[2] multiple sequence alignment
[3] models of substitution
[4] tree-building
[5] tree evaluation

Introduction
Charles Darwin's 1859 book (On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life) introduced the theory of evolution. To Darwin, the struggle for existence induces a natural selection. Offspring are dissimilar from their parents (that is, variability exists), and individuals that are more fit for a given environment are selected for. In this way, over long periods of time, species evolve. Groups of organisms change over time so that descendants differ structurally and functionally from their ancestors. (Page 357)

At the molecular level, evolution is a process of mutation with selection. Molecular evolution is the study of changes in genes and proteins throughout different branches of the tree of life. Phylogeny is the inference of evolutionary relationships. Traditionally, phylogeny relied on the comparison of morphological features between organisms. Today, molecular sequence data are also used for phylogenetic analyses. (Page 358)

Historical background
Studies of molecular evolution began with the first sequencing of proteins in the 1950s. In 1953 Frederick Sanger and colleagues determined the primary amino acid sequence of insulin. (The accession number of human insulin is NP_000198.) (Page 358)

Mature insulin consists of an A chain and B chain heterodimer connected by disulfide bridges. The signal peptide and C peptide are cleaved, and their sequences display fewer functional constraints. Note the sequence divergence in the disulfide loop region of the A chain. (Page 359)

Historical background: insulin
By the 1950s, it became clear that amino acid substitutions occur nonrandomly. For example, Sanger and colleagues noted that most amino acid changes in the insulin A chain are restricted to a disulfide loop region. Such differences are called neutral changes (Kimura, 1968; Jukes and Cantor, 1969). Subsequent studies at the DNA level showed that the rate of nucleotide (and of amino acid) substitution is about six- to ten-fold higher in the C peptide, relative to the A and B chains. (Page 358)

Figure: number of nucleotide substitutions per site per year for the regions of insulin, on the order of 10^-9. (Page 359)

Historical background: insulin
Guinea pig and coypu insulin have undergone an extremely rapid rate of evolutionary change. Surprisingly, insulin from the guinea pig (and from the related coypu) evolves seven times faster than insulin from other species. Why? The answer is that guinea pig and coypu insulin do not bind two zinc ions, while insulin molecules from most other species do. There was a relaxation of the structural constraints on these molecules, and so the genes diverged rapidly.

Figure: arrows indicate positions at which guinea pig insulin (A chain and B chain) differs from both human and mouse. (Page 359)

Molecular clock hypothesis
In the 1960s, sequence data were accumulated for small, abundant proteins such as globins, cytochromes c, and fibrinopeptides. Some proteins appeared to evolve slowly, while others evolved rapidly. Linus Pauling, Emanuel Margoliash and others proposed the hypothesis of a molecular clock: for every given protein, the rate of molecular evolution is approximately constant in all evolutionary lineages.

As an example, Richard Dickerson (1971) plotted data from three protein families: cytochrome c, hemoglobin, and fibrinopeptides. The x-axis shows the divergence times of the species, estimated from paleontological data. The y-axis shows m, the corrected number of amino acid changes per 100 residues. n is the observed number of amino acid changes per 100 residues, and it is corrected to m to account for changes that occur but are not observed:

$$\frac{n}{100} = 1 - e^{-(m/100)}$$

Molecular clock hypothesis: conclusions
Figure (Dickerson, 1971): corrected amino acid changes per 100 residues (m) plotted against millions of years since divergence.

Dickerson drew the following conclusions:
For each protein, the data lie on a straight line. Thus, the rate of amino acid substitution has remained constant for each protein.
The average rate of change differs for each protein. The time for a 1% change to occur between two lines of evolution is 20 MY (cytochrome c), 5.8 MY (hemoglobin), and 1.1 MY (fibrinopeptides).
The observed variations in rate of change reflect functional constraints imposed by natural selection.
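The correction from observed changes (n) to corrected changes (m) can be made concrete by inverting Dickerson's relation. The following is a minimal sketch, assuming the relation n/100 = 1 - e^(-m/100); the function names are hypothetical and chosen only for illustration.

```python
import math


def observed_changes(m: float) -> float:
    """Observed changes per 100 residues (n) for a corrected value m."""
    return 100.0 * (1.0 - math.exp(-m / 100.0))


def corrected_changes(n: float) -> float:
    """Corrected changes per 100 residues (m) for an observed value n.

    Inverts n/100 = 1 - exp(-m/100), giving m = -100 * ln(1 - n/100).
    """
    if not 0.0 <= n < 100.0:
        raise ValueError("n must satisfy 0 <= n < 100")
    return -100.0 * math.log(1.0 - n / 100.0)


# The correction grows with divergence: few observed changes need almost
# no correction, while many observed changes hide many multiple hits.
for n in (5, 25, 50, 75):
    print(n, round(corrected_changes(n), 1))
```

For small n the two values nearly coincide; as n approaches 100 the corrected value grows without bound, which is why saturated sequences carry little usable clock information.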

Molecular clock hypothesis: implications
If protein sequences evolve at constant rates, they can be used to estimate the times that species diverged. This is analogous to dating geological specimens by radioactive decay.

Positive and negative selection
Darwin's theory of evolution suggests that, at the phenotypic level, traits in a population that enhance survival are selected for, while traits that reduce fitness are selected against. For example, among a group of giraffes millions of years in the past, those giraffes that had longer necks were able to reach higher foliage and were more reproductively successful than their shorter-necked group members; that is, the taller giraffes were selected for.

In the mid-20th century, a conventional view was that molecular sequences are routinely subject to positive (or negative) selection. Positive selection occurs when a sequence undergoes significantly increased rates of substitution, while negative selection occurs when a sequence undergoes change slowly. Otherwise, selection is neutral.

Tajima's relative rate test in MEGA

Neutral theory of evolution
An often-held view of evolution is that just as organisms propagate through natural selection, so also DNA and protein molecules are selected for. According to Motoo Kimura's 1968 neutral theory of molecular evolution, the vast majority of DNA changes are not selected for in a Darwinian sense. The main cause of evolutionary change is random drift of mutant alleles that are selectively neutral (or nearly neutral). Positive Darwinian selection does occur, but it has a limited role. As an example, the divergent C peptide of insulin changes according to the neutral mutation rate.

Goals of molecular phylogeny
Phylogeny can answer questions such as:
How many genes are related to my favorite gene?
Was the extinct quagga more like a zebra or a horse?
Was Darwin correct that humans are closest to chimps and gorillas?
How related are whales, dolphins & porpoises to cows?
Where and when did HIV originate?
What is the history of life on earth?

Molecular phylogeny: nomenclature of trees
Molecular phylogeny uses trees to depict evolutionary relationships among organisms. These trees are based upon DNA and protein sequence data. There are two main kinds of information inherent to any tree: topology and branch lengths. We will now describe the parts of a tree.

Tree nomenclature
Figure: example tree of lettered taxa, drawn against a time axis with a scale bar of one unit.

Tree nomenclature
Labels on the example tree:
taxon: an operational taxonomic unit (OTU), such as a protein sequence
branch (edge)
node: the intersection or terminating point of two or more branches

Branches are unscaled: OTUs are neatly aligned, and nodes reflect time.
Branches are scaled: branch lengths are proportional to the number of amino acid changes.

An internal node may be bifurcating or multifurcating.

Examples of multifurcation: failure to resolve the branching order of some metazoans and protostomes. Rokas A. et al., Animal Evolution and the Molecular Signature of Radiations Compressed in Time, Science 310:1933 (2005).

Tree nomenclature: clades
Figure: a clade (monophyletic group) highlighted on the example tree.

Tree nomenclature
Figures: further clades highlighted on the same example tree.

Examples of clades: Lindblad-Toh et al., Nature 438:803 (2005), fig. 10.

Tree roots
The root of a phylogenetic tree represents the common ancestor of the sequences. Some trees are unrooted, and thus do not specify the common ancestor. A tree can be rooted using an outgroup (that is, a taxon known to be distantly related to all other OTUs).

Tree nomenclature: roots
Figure: a rooted tree (which specifies an evolutionary path from past to present) beside an unrooted tree of the same taxa.

Tree nomenclature: outgroup rooting
Figure: a rooted tree in which an outgroup is used to place the root.

Enumerating trees
Cavalli-Sforza and Edwards (1967) derived the number of possible unrooted trees (N_U) for n OTUs (n > 3):

$$N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$

The number of bifurcating rooted trees (N_R) is:

$$N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$$

For 10 OTUs (e.g. 10 DNA or protein sequences), the number of possible rooted trees is 34 million, and the number of unrooted trees is 2 million. Many tree-making algorithms can exhaustively examine every possible tree for up to ten to twelve sequences.

Numbers of trees
Table: number of rooted trees and number of unrooted trees as a function of the number of OTUs (for example, 34,459,425 rooted trees for 10 OTUs).

Species trees versus gene/protein trees
Molecular evolutionary studies can be complicated by the fact that both species and genes evolve. Speciation usually occurs when a species becomes reproductively isolated. In a species tree, each internal node represents a speciation event. Genes (and proteins) may duplicate or otherwise evolve before or after any given speciation event. The topology of a gene (or protein) based tree may differ from the topology of a species tree. (Page 370)

Figure: a species tree drawn from past to present, in which a speciation event separates two species; gene duplication events before and after the speciation event produce the OTUs of the gene tree.
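The tree-count formulas of Cavalli-Sforza and Edwards can be evaluated directly to see how quickly exhaustive search becomes impractical. This is a small sketch using exact integer arithmetic; the function names are hypothetical.

```python
from math import factorial


def unrooted_trees(n: int) -> int:
    """Number of bifurcating unrooted trees for n OTUs (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def rooted_trees(n: int) -> int:
    """Number of bifurcating rooted trees for n OTUs (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (4, 5, 10, 20):
    print(n, rooted_trees(n), unrooted_trees(n))
```

For 10 OTUs this gives 34,459,425 rooted and 2,027,025 unrooted trees, matching the "34 million" and "2 million" quoted in the slide. The number of rooted trees for n OTUs equals the number of unrooted trees for n + 1 OTUs, because a root behaves like one extra leaf.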

Stage 1: Use of DNA, RNA, or protein
For some phylogenetic studies, it may be preferable to use protein instead of DNA sequences. We saw that in pairwise alignment and in BLAST searching, protein is often more informative than DNA (Chapter 3). Proteins have 20 states (amino acids) instead of only four for DNA, so there is a stronger phylogenetic signal.

For phylogeny, DNA can be more informative.
--The protein-coding portion of DNA has synonymous and nonsynonymous substitutions. Thus, some DNA changes do not have corresponding protein changes. (Page 373)

You can measure the synonymous and nonsynonymous substitution rates by pasting your FASTA-formatted sequences into the SNAP program at the Los Alamos National Labs HIV database (hiv-web.lanl.gov/).

If the synonymous substitution rate (d_S) is greater than the nonsynonymous substitution rate (d_N), the DNA sequence is under negative (purifying) selection. This limits change in the sequence (e.g. an insulin chain).

If d_S < d_N, positive selection occurs. For example, a duplicated gene may evolve rapidly to assume new functions.

Stage 1: Use of DNA, RNA, or protein
For phylogeny, DNA can be more informative.
--Some substitutions in a DNA sequence alignment can be directly observed: single nucleotide substitutions, sequential substitutions, coincidental substitutions. Additional mutational events can be inferred by analysis of ancestral sequences. These changes include parallel substitutions, convergent substitutions, and back substitutions. (Page 374)
--Noncoding regions (such as 5' and 3' untranslated regions) may be analyzed using molecular phylogeny.
--Pseudogenes (nonfunctional genes) are studied by molecular phylogeny.
--Rates of transitions and transversions can be measured. Transitions: purine (A ↔ G) or pyrimidine (C ↔ T) substitutions. Transversions: purine ↔ pyrimidine.

Models of nucleotide substitution
Figure: the four nucleotides A, G, C and T, with the two transitions (A ↔ G, C ↔ T) and the transversions between purines and pyrimidines marked. (Page 379)
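The definitions of transitions and transversions translate into a short counting routine for a pair of aligned DNA sequences. The sketch below is illustrative only: the function name is hypothetical, and skipping gaps or ambiguous bases is an assumption rather than something the slides prescribe.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1: str, seq2: str) -> tuple[int, int]:
    """Count (transitions, transversions) between two aligned DNA sequences.

    Positions with gaps or non-ACGT symbols are skipped (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # Same chemical class (purine/purine or pyrimidine/pyrimidine)
        # means a transition; a change of class is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


ts, tv = count_ts_tv("AACGT", "GATGA")
print(ts, tv)  # A->G and C->T are transitions; T->A is a transversion
```

Dividing the first count by the second gives the transition/transversion ratio that programs such as MEGA report.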

MEGA outputs transition and transversion frequencies
For primate mitochondrial DNA, the ratio of transitions to transversions is particularly high.

Stage 2: Multiple sequence alignment
The fundamental basis of a phylogenetic tree is a multiple sequence alignment. (If there is a misalignment, or if a nonhomologous sequence is included in the alignment, it will still be possible to generate a tree.)

Consider the following alignment of 13 orthologous retinol-binding proteins (RBPs). (Page 375)

Some positions of the multiple sequence alignment are invariant (marked by an arrow). Some positions distinguish fish RBP from all other RBPs (arrow 3).

Stage 2: Multiple sequence alignment
[1] Confirm that all sequences are homologous
[2] Adjust gap creation and extension penalties as needed to optimize the alignment
[3] Restrict phylogenetic analysis to regions of the multiple sequence alignment for which data are available for all taxa (delete columns having incomplete data).
[4] Many experts recommend that you delete any column of an alignment that contains gaps (even if the gap occurs in only one taxon)

In this example, note that four RBPs are from fish, while the others are vertebrates that evolved more recently. (Page 375)

Stage 3: Tree-building models: distance
The simplest approach to measuring distances between sequences is to align pairs of sequences, and then to count the number of differences. The degree of divergence is called the Hamming distance. For an alignment of length N with n sites at which there are differences, the degree of divergence is:

$$D = n / N$$

But observed differences do not equal genetic distance! Genetic distance involves mutations that are not observed directly (see earlier figure). (Page 378)

Jukes and Cantor (1969) proposed a corrective formula, where p is the observed proportion of differing sites (n / N):

$$D = -\frac{3}{4} \ln\left(1 - \frac{4}{3}\,p\right)$$

This model describes the probability that one nucleotide will change into another. It assumes that each residue is equally likely to change into any other (i.e. the rate of transversions equals the rate of transitions). In practice, the transition rate is typically greater than the transversion rate. (Page 379)
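Two of the steps above, deleting every alignment column that contains a gap and computing the normalized Hamming distance n / N, can be sketched as follows. The function names and the toy alignment are hypothetical, and "-" is assumed to be the gap symbol.

```python
def drop_gap_columns(alignment: list[str]) -> list[str]:
    """Remove every column that contains a gap ('-') in any sequence."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    kept = [col for col in zip(*alignment) if "-" not in col]
    if not kept:
        return ["" for _ in alignment]
    # Transpose the surviving columns back into rows (sequences).
    return ["".join(chars) for chars in zip(*kept)]


def p_distance(seq1: str, seq2: str) -> float:
    """Normalized Hamming distance: differing sites / alignment length."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and equally long")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


toy = ["AC-GT", "ACTGA", "A-TGT"]  # hypothetical three-taxon alignment
clean = drop_gap_columns(toy)
print(clean)                          # gap-free columns only
print(p_distance(clean[0], clean[1]))
```

Removing gap columns first keeps N identical for every pair of taxa, so the pairwise distances are computed over the same set of sites.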

Jukes and Cantor one-parameter model of nucleotide substitution (α = β)
Kimura model of nucleotide substitution (assumes α ≠ β)
Figure: the four nucleotides A, G, C and T connected by arrows; transitions occur at rate α and transversions at rate β. (Page 379)

Stage 3: Tree-building models: distance
Jukes and Cantor (1969) proposed a corrective formula:

$$D = -\frac{3}{4} \ln\left(1 - \frac{4}{3}\,p\right)$$

Consider an alignment where 3/60 aligned residues differ. The normalized Hamming distance is 3/60 = 0.05. The Jukes-Cantor correction is

$$D = -\frac{3}{4} \ln\left(1 - \frac{4}{3}(0.05)\right) \approx 0.052$$

When 30/60 aligned residues differ, the Jukes-Cantor correction is more substantial:

$$D = -\frac{3}{4} \ln\left(1 - \frac{4}{3}(0.5)\right) \approx 0.82$$

(Page 379)

Use MEGA to display a pairwise distance matrix of globins.
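The Jukes-Cantor correction is easy to check numerically. The sketch below evaluates it for p = 0.05 and p = 0.5, the two proportions used in the worked example; the function name is hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p.

    Defined only for 0 <= p < 0.75; at p = 0.75 the sequences are as
    different as random DNA and the logarithm diverges.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for p in (0.05, 0.5):
    print(p, round(jukes_cantor(p), 3))
```

At p = 0.05 the corrected distance (about 0.052) barely exceeds the observed one, while at p = 0.5 it rises to about 0.82, showing how many substitutions go unobserved between divergent sequences.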

Gamma models account for unequal substitution rates across variable sites.
Figure: distributions of substitution rates across sites for several values of the gamma shape parameter α, up to α = 5.

Stage 4: Tree-building methods
We will discuss two tree-building methods: distance-based and character-based.

Distance-based methods involve a distance metric, such as the number of amino acid changes between the sequences, or a distance score. Examples of distance-based algorithms are UPGMA and neighbor-joining.

Character-based methods include maximum parsimony and maximum likelihood. Parsimony analysis involves the search for the tree with the fewest amino acid (or nucleotide) changes that account for the observed differences between taxa. (Page 377)

We can introduce distance-based and character-based tree-building methods by referring to a tree of 13 orthologous retinol-binding proteins, and the multiple sequence alignment from which the tree was generated.

Orthologs: members of a gene (protein) family in various organisms. This tree shows RBP orthologs.

Figure: tree of RBP orthologs with a scale bar in number of changes.
Fish RBP orthologs: common carp, zebrafish, rainbow trout, teleost.
Other vertebrate RBP orthologs: African clawed frog, chicken, human, horse, pig, cow, mouse, rat, rabbit.

Distance-based tree: calculate the pairwise alignments; if two sequences are related, put them next to each other on the tree.

Character-based tree: identify positions that best describe how characters (amino acids) are derived from common ancestors.

Stage 4: Tree-building methods
Regardless of whether you use distance- or character-based methods for building a tree, the starting point is a multiple sequence alignment. ReadSeq is a convenient web-based program that translates multiple sequence alignments into formats compatible with most commonly used phylogeny programs such as PAUP and PHYLIP. (Page 378)

This site lists a large number of phylogeny packages. Perhaps the best-known programs are PAUP (David Swofford and colleagues) and PHYLIP (Joe Felsenstein).

ReadSeq is widely available; try the tools menu at the LANL HIV database.

Stage 4: Tree-building methods
We will discuss four tree-building methods:
[1] distance-based
[2] character-based: maximum parsimony
[3] character- and model-based: maximum likelihood
[4] character- and model-based: Bayesian

Stage 4: Tree-building methods: distance
Many software packages are available for making phylogenetic trees. We will describe two programs.
[1] MEGA (Molecular Evolutionary Genetics Analysis) by Sudhir Kumar, Koichiro Tamura, and Masatoshi Nei.
[2] Phylogenetic Analysis Using Parsimony (PAUP), written by David Swofford.
We will next use MEGA and PAUP to generate trees by the distance-based method UPGMA. (Page 379)

How to use MEGA to make a tree
[1] Enter a multiple sequence alignment (.meg) file
[2] Under the phylogeny menu, select one of these four methods:
Neighbor-Joining (NJ)
Minimum Evolution (ME)
Maximum Parsimony (MP)
UPGMA
Click green boxes to obtain options. Click compute to obtain the tree.

Use of MEGA for a distance-based tree: UPGMA
A variety of styles are available for tree display.

Tree-building methods: UPGMA
UPGMA is the unweighted pair group method using arithmetic mean.

Flipping branches around a node creates an equivalent topology.

Step 1: Compute the pairwise distances of all the proteins. Get ready to put the numbers 1-5 at the bottom of your new tree.
Step 2: Find the two proteins with the smallest pairwise distance. Cluster them.
Step 3: Do it again. Find the next two proteins with the smallest pairwise distance. Cluster them.
Step 4: Keep going. Cluster.
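The UPGMA steps above can be sketched in a few lines. This is a minimal illustration, not MEGA's implementation: the function name, the nested-tuple tree representation, and the five-taxon distance matrix are all hypothetical. Distances between clusters are size-weighted arithmetic means, and each merge is placed at half the distance between the clusters it joins.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (tree, merges): tree is a nested tuple, merges lists
    (cluster_a, cluster_b, node_height) in the order clusters were joined.
    """
    sizes = {name: 1 for name in names}   # cluster -> number of taxa in it
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Find the pair of clusters with the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = (a, b)
        # Distance from the new cluster to each remaining cluster is the
        # arithmetic mean over all member taxa (size-weighted average).
        for other in sizes:
            d[frozenset((new, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((a, b, height))
    (tree,) = sizes
    return tree, merges


# Hypothetical distances among five proteins labelled 1-5.
raw = {
    ("1", "2"): 2, ("1", "3"): 4, ("2", "3"): 4, ("4", "5"): 3,
    ("1", "4"): 8, ("2", "4"): 8, ("3", "4"): 8,
    ("1", "5"): 8, ("2", "5"): 8, ("3", "5"): 8,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
tree, merges = upgma(["1", "2", "3", "4", "5"], dist)
print(tree)
for a, b, height in merges:
    print("join", a, "+", b, "at height", height)
```

Because every merge is placed at half the inter-cluster distance, all tips end up equidistant from the root, which is how the algorithm builds in a constant molecular clock.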

Tree-building methods: UPGMA
Step 5: Last cluster! This is your tree.

Distance-based methods: UPGMA trees
UPGMA is a simple approach for making trees.
A UPGMA tree is always rooted.
An assumption of the algorithm is that the molecular clock is constant for sequences in the tree. If there are unequal substitution rates, the tree may be wrong.
While UPGMA is simple, it is less accurate than the neighbor-joining approach (described next). (Page 383)

Making trees using neighbor-joining
The neighbor-joining method of Saitou and Nei (1987) is especially useful for making a tree having a large number of taxa.
Begin by placing all the taxa in a star-like structure. (Page 383)

Tree-building methods: Neighbor joining
Next, identify neighbors (e.g. 1 and 2) that are most closely related. Connect these neighbors to other OTUs via an internal branch, XY. At each successive stage, minimize the sum of the branch lengths. (Page 384)

Define the distance from X to Y by

$$d_{XY} = \frac{1}{2}\left(d_{1Y} + d_{2Y} - d_{12}\right)$$

Use of MEGA for a distance-based tree: NJ
Neighbor joining produces a tree reasonably similar to the UPGMA tree.

Example of a neighbor-joining tree: phylogenetic analysis of 13 RBPs (Page 385)

Tree-building methods: character based
Rather than pairwise distances between proteins, evaluate the aligned columns of amino acid residues (characters). Tree-building methods based on characters include maximum parsimony and maximum likelihood. (Page 383)

Making trees using character-based methods
The main idea of character-based methods is to find the tree with the shortest branch lengths possible. Thus we seek the most parsimonious ("simple") tree.
Identify informative sites. For example, constant characters are not parsimony-informative.
Construct trees, counting the number of changes required to create each tree. For about 12 taxa or fewer, evaluate all possible trees exhaustively; for >12 taxa perform a heuristic search.
Select the shortest tree (or trees). (Page 383)

Tree-building methods: Maximum parsimony
As an example of tree-building using maximum parsimony, consider these four taxa: AAG, AAA, GGA, AGA. How might they have evolved from a common ancestor such as AAA?

Figure: three candidate trees for the four taxa, with Cost = 3, Cost = 4 and Cost = 4. In maximum parsimony, choose the tree(s) with the lowest cost (shortest branch lengths). (Page 385)
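The cost of each candidate tree in the parsimony example can be counted with Fitch's small-parsimony algorithm, a standard method swapped in here for illustration (the slides do not name the counting procedure). The sketch assumes the four taxa are AAG, AAA, GGA and AGA; the taxon names t1 to t4 and the function names are hypothetical.

```python
def fitch_site(tree, states):
    """Fitch pass for one alignment column.

    tree:   a taxon name (leaf) or a 2-tuple of subtrees.
    states: dict mapping taxon name to its character in this column.
    Returns (possible_ancestral_states, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: one substitution is required at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_cost(tree, seqs):
    """Total number of changes over all columns of the alignment."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in seqs.items()})[1]
        for i in range(length)
    )


seqs = {"t1": "AAG", "t2": "AAA", "t3": "GGA", "t4": "AGA"}
trees = [
    (("t1", "t2"), ("t3", "t4")),
    (("t1", "t3"), ("t2", "t4")),
    (("t1", "t4"), ("t2", "t3")),
]
costs = [parsimony_cost(tree, seqs) for tree in trees]
print(costs)
```

With four taxa there are only three unrooted topologies, so all of them can be scored. The grouping (AAG, AAA) with (GGA, AGA) needs three changes and the other two need four, in line with the costs 3, 4 and 4 in the figure.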

MEGA for maximum parsimony (MP) trees
Options include heuristic approaches and bootstrapping.

In maximum parsimony, there may be more than one tree having the lowest total branch length. You may compute the consensus best tree.

Figure: four displays of the same tree. (Page 387)
Phylogram (values are proportional to branch lengths)
Rectangular phylogram (values are proportional to branch lengths)
Cladogram (values are not proportional to branch lengths)
Rectangular cladogram (values are not proportional to branch lengths)
These four trees display the same data in different formats.

Making trees using maximum likelihood
Maximum likelihood is an alternative to maximum parsimony. It is computationally intensive. A likelihood is calculated for the probability of each residue in an alignment, based upon some model of the substitution process.

What are the tree topology and branch lengths that have the greatest likelihood of producing the observed data set?

ML is implemented in the TREE-PUZZLE program, as well as PAUP and PHYLIP.

Maximum likelihood: Tree-Puzzle
(1) Reconstruct all possible quartets A, B, C, D. For 12 myoglobins there are 495 possible quartets.
(2) Puzzling step: begin with one quartet tree. N-4 sequences remain. Add them to the branches systematically, estimating the support for each internal branch.
Report a consensus tree.

Figures: maximum likelihood tree; quartet puzzling.

Bayesian inference of phylogeny with MrBayes
Calculate:

$$\Pr[\text{Tree} \mid \text{Data}] = \frac{\Pr[\text{Data} \mid \text{Tree}] \times \Pr[\text{Tree}]}{\Pr[\text{Data}]}$$

Pr[Tree | Data] is the posterior probability distribution of trees. Ideally this involves a summation over all possible trees. In practice, Markov chain Monte Carlo (MCMC) simulations are run to estimate the posterior probability distribution.

Notably, Bayesian approaches require you to specify prior assumptions about the model of evolution.

Stage 5: Evaluating trees
The main criteria by which the accuracy of a phylogenetic tree is assessed are consistency, efficiency, and robustness. Evaluation of accuracy can refer to an approach (e.g. UPGMA) or to a particular tree. (Page 388)

Stage 5: Evaluating trees: bootstrapping
Bootstrapping is a commonly used approach to measuring the robustness of a tree topology. Given a branching order, how consistently does an algorithm find that branching order in a randomly permuted version of the original data set?

To bootstrap, make an artificial dataset obtained by randomly sampling columns, with replacement, from your multiple sequence alignment. Make the dataset the same size as the original. Do 100 (to 1,000) bootstrap replicates. Observe the percent of cases in which the assignment of clades in the original tree is supported by the bootstrap replicates. >70% is considered significant. (Page 388)

MEGA for maximum parsimony (MP) trees
Bootstrap values show the percent of times each clade is supported after a large number (n=500) of replicate samplings of the data.
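The bootstrap procedure, resampling alignment columns with replacement and counting how often a clade reappears, can be sketched as follows. The function names are hypothetical, and `has_clade` is a placeholder callback: in a real analysis it would rebuild a tree from each replicate and test whether the clade of interest is present, whereas the toy predicate below only compares two sequences.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return a pseudo-replicate: columns drawn with replacement,
    with the same number of columns as the original alignment."""
    width = len(alignment[0])
    picks = [rng.randrange(width) for _ in range(width)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def clade_support(alignment, has_clade, replicates=100, seed=0):
    """Percent of bootstrap replicates in which has_clade(...) is True.

    has_clade is a caller-supplied function (hypothetical here) that
    takes a resampled alignment and reports whether the clade is found.
    """
    rng = random.Random(seed)
    hits = sum(
        bool(has_clade(bootstrap_alignment(alignment, rng)))
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


def first_two_closest(aln):
    """Toy stand-in for tree building: are sequences 0 and 1 closer to
    each other than either is to sequence 2?"""
    def diff(a, b):
        return sum(x != y for x, y in zip(a, b))
    return diff(aln[0], aln[1]) < min(diff(aln[0], aln[2]), diff(aln[1], aln[2]))


toy = ["ACGTACGT", "ACGTACGA", "TCGAACTT"]  # hypothetical alignment
print(clade_support(toy, first_two_closest, replicates=500))
```

Sampling with replacement means some columns appear several times in a replicate and others not at all, so a clade that depends on only a few columns receives low support.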

In 61% of the bootstrap resamplings, ssrbp and btrbp (pig and cow RBP) formed a distinct clade. In 39% of the cases, another protein joined the clade (e.g. ecrbp), or one of these two sequences joined another clade. (Page 388)
Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

BINF6201/8201. Molecular phylogenetic methods

BINF6201/8201. Molecular phylogenetic methods BINF60/80 Molecular phylogenetic methods 0-7-06 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5. Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu Distance Methods; Character Methods; Molecular Clock; UPGMA; Maximum Parsimony

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony Bioinformatics 1 -- lecture 9 Phylogenetic trees Distance-based tree building Parsimony (,(,(,))) Trees can be represented in "parenthesis notation". Each set of parentheses represents a branch-point (bifurcation),

Tree of Life Biological Sequence Analysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction All organisms on Earth have a common ancestor. All species are related. The relationship is called a phylogeny

Theory of Evolution. Charles Darwin

Theory of Evolution. Charles Darwin Theory of Evolution Charles Darwin 1858-59: Origin of Species 5 year voyage of H.M.S. Beagle (1831-36) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B guangyi@hawaii.edu http://www.soest.hawaii.edu/marinefungi/ocn403webpage.htm General introduction and overview Taxonomy [Greek

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 2016 After this lecture, you can discuss (dis-) advantages of different information types

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline Phylogenetics Todd Vision Biology 522 March 26, 2007 Applications of phylogenetics Studying organismal or biogeographic history Systematics Dating events in the fossil record Conservation biology Studying

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

Theory of Evolution Charles Darwin

Theory of Evolution Charles Darwin Theory of Evolution Charles Darwin 1858-59: Origin of Species 5 year voyage of H.M.S. Beagle (1831-36) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence alignment is the process of aligning two sequences to achieve the maximum level of identity between them. This helps to derive functional, structural and evolutionary relationships between

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

What is Phylogenetics

What is Phylogenetics What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

How to read and make phylogenetic trees Zuzana Starostová

How to read and make phylogenetic trees Zuzana Starostová How to read and make phylogenetic trees Zuzana Starostová How to make phylogenetic trees? Workflow: obtain DNA sequence quality check sequence alignment calculating genetic distances phylogeny estimation

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences Tore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences Tore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe A B C D A B C D (((A,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood

A (short) introduction to phylogenetics

A (short) introduction to phylogenetics A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field

Phylogeny Tree Algorithms

Phylogeny Tree Algorithms Phylogeny Tree Algorithms Jianlin Cheng, PhD School of Electrical Engineering and Computer Science University of Central Florida 2006 Free for academic use. Copyright @ Jianlin Cheng & original sources for some

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016 Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Bio 1B Lecture Outline (please print and bring along) Fall, 2007 Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strickberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

ELE4120 Bioinformatics Tutorial 8

ELE4120 Bioinformatics Tutorial 8 ELE4120 Bioinformatics Tutorial 8 Content Classifying Organisms Systematics and Speciation Taxonomy and phylogenetics Phenetics versus cladistics Phylogenetic trees Biological classification Goal: To develop

Phylogenetics. BIOL 7711 Computational Bioscience

Phylogenetics. BIOL 7711 Computational Bioscience Consortium for Comparative Genomics University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Exam: Mon. 30.01.2011, Time: 15:30-17:00, Room: HS14, Registration: Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

C.DARWIN (1809-1882)

C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from an ancestor that appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics CS5263 Bioinformatics Guest Lecture Part II Phylogenetics Up to now we have focused on finding similarities, now we start focusing on differences (dissimilarities leading to distance measures). Identifying

Multiple Sequence Alignment. Sequences

Multiple Sequence Alignment. Sequences Multiple Sequence Alignment Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe

EVOLUTIONARY DISTANCES

EVOLUTIONARY DISTANCES EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:

Understanding relationship between homologous sequences

Understanding relationship between homologous sequences Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26 Phylogeny Chapter 26 Taxonomy Taxonomy: ordered division of organisms into categories based on a set of characteristics used to assess similarities and differences Carolus Linnaeus developed binomial nomenclature,

PHYLOGENY & THE TREE OF LIFE

PHYLOGENY & THE TREE OF LIFE PHYLOGENY & THE TREE OF LIFE PREFACE In this powerpoint we learn how biologists distinguish and categorize the millions of species on earth. Earlier we looked at the process of evolution; here we look at

Sequence Analysis '17--lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony

Sequence Analysis '17--lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony Sequence Analysis '17--lecture 10 Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony Phylogenetic trees What is a phylogenetic tree? A model of evolutionary relationships -- common

1 ATGGGTCTC 2 ATGAGTCTC

1 ATGGGTCTC 2 ATGAGTCTC We need an optimality criterion to choose a best estimate (tree) Other optimality criteria used to choose a best estimate (tree) Parsimony: begins with the assumption that the simplest hypothesis that

Phylogenetics Todd Vision Spring Some applications. Uncultured microbial diversity

Phylogenetics Todd Vision Spring Some applications. Uncultured microbial diversity Phylogenetics Todd Vision Spring 2008 Tree basics Sequence alignment Inferring a phylogeny Neighbor joining Maximum parsimony Maximum likelihood Rooting trees and measuring confidence Software and file

Inferring phylogeny. Today's topics. Milestones of molecular evolution studies Contributions to molecular evolution

Inferring phylogeny. Today's topics. Milestones of molecular evolution studies Contributions to molecular evolution Today's topics Inferring phylogeny Introduction, Distance methods, Parsimony method Overview of phylogenetic inferences Methodology Methods

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline Page 1 Evolutionary Trees Russ B. Altman MI S 7 Outline. Why build evolutionary trees? Distance-based vs. character-based methods. Distance-based: Ultrametric Trees, Additive Trees. Character-based: Perfect phylogeny

Evolutionary Tree Analysis. Overview

Evolutionary Tree Analysis. Overview CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

Chapter 16: Reconstructing and Using Phylogenies

Chapter 16: Reconstructing and Using Phylogenies Chapter Review 1. Use the phylogenetic tree shown at the right to complete the following. a. Explain how many clades are indicated: Three: (1) chimpanzee/human, (2) chimpanzee/ human/gorilla, and (3)chimpanzee/human/

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

Anatomy of a tree. clade is group of organisms with a shared ancestor. a monophyletic group shares a single common ancestor = tapirs-rhinos-horses

Anatomy of a tree. clade is group of organisms with a shared ancestor. a monophyletic group shares a single common ancestor = tapirs-rhinos-horses Anatomy of a tree outgroup: an early branching relative of the interest groups sister taxa: taxa derived from the same recent ancestor polytomy: >2 taxa emerge from a node Anatomy of a tree clade is group

Chapter 26 Phylogeny and the Tree of Life

Chapter 26 Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life Biologists estimate that there are about 5 to 100 million species of organisms living on Earth today. Evidence from morphological, biochemical, and gene sequence

Phylogenetics. IMBB 2016 BecA-ILRI Hub, Nairobi May 9-20, Joyce Nzioki

Phylogenetics. IMBB 2016 BecA-ILRI Hub, Nairobi May 9-20, Joyce Nzioki Phylogenetics IMBB 2016 BecA-ILRI Hub, Nairobi May 9-20, 2016 Joyce Nzioki Phylogenetics The study of evolutionary relatedness of organisms. Derived from two Greek words:» Phle/Phylon: Tribe/Race» Genetikos:

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions

Cladistics and Bioinformatics Questions 2013

Cladistics and Bioinformatics Questions 2013 AP Biology Name Cladistics and Bioinformatics Questions 2013 1. The following table shows the percentage similarity in sequences of nucleotides from a homologous gene derived from five different species

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger The tree of life, Gustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches long branch attraction

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.fa.aln as input and

Classification and Phylogeny

Classification and Phylogeny Classification and Phylogeny The diversity of life is great. To communicate about it, there must be a scheme for organization. There are many species that would be difficult to organize without a scheme

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057

Estimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057 Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony: no model, simply count minimum number

7. Tests for selection

7. Tests for selection Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www.nowicklab.info

Chapter 19: Taxonomy, Systematics, and Phylogeny

Chapter 19: Taxonomy, Systematics, and Phylogeny Chapter 19: Taxonomy, Systematics, and Phylogeny AP Curriculum Alignment Chapter 19 expands on the topics of phylogenies and cladograms, which are important to Big Idea 1. In order for students to understand

Lecture 6 Phylogenetic Inference

Lecture 6 Phylogenetic Inference Lecture 6 Phylogenetic Inference From Darwin's notebook in 1837 Charles Darwin Willi Hennig From The Origin in 1859 Cladistics Phylogenetic inference Willi Hennig, Cladistics 1. Clade, Monophyletic group,

Phylogenetic analyses. Kirsi Kostamo

Phylogenetic analyses. Kirsi Kostamo Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

Classification and Phylogeny

Classification and Phylogeny Classification and Phylogeny The diversity of life is great. To communicate about it, there must be a scheme for organization. There are many species that would be difficult to organize without a scheme

Lecture 11 Friday, October 21, 2011

Lecture 11 Friday, October 21, 2011 Lecture 11 Friday, October 21, 2011 Phylogenetic tree (phylogeny) Darwin and classification: In the Origin, Darwin said that descent from a common ancestral species could explain why the Linnaean system

Phylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification

Phylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification Multiple sequence alignment global local Evolutionary tree reconstruction Pairwise sequence alignment (global and local) Substitution matrices Gene Finding Protein structure prediction N structure prediction

Introduction to Bioinformatics Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Dr. rer. nat. Gong Jing Cancer Research Center Medicine School of Shandong University 2012.11.09 1 Chapter 4 Phylogenetic Tree 2 Phylogeny Evidence from morphological, biochemical, and gene sequence

Phylogeny: building the tree of life

Phylogeny: building the tree of life Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan

Name: Class: Date: ID: A

Name: Class: Date: ID: A Class: _ Date: _ Ch 17 Practice test 1. A segment of DNA that stores genetic information is called a(n) a. amino acid. b. gene. c. protein. d. intron. 2. In which of the following processes does change

Estimating Evolutionary Trees. Phylogenetic Methods

Estimating Evolutionary Trees. Phylogenetic Methods Estimating Evolutionary Trees: if the data are consistent with infinite sites then all methods should yield the same tree; it gets more complicated when there is homoplasy, i.e., parallel or convergent

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis 10 December 2012 - Corrections - Exercise 1 Non-vertebrate chordates generally possess 2 homologs, vertebrates 3 or more gene copies; a Drosophila

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

FUNDAMENTALS OF MOLECULAR EVOLUTION

FUNDAMENTALS OF MOLECULAR EVOLUTION FUNDAMENTALS OF MOLECULAR EVOLUTION Second Edition Dan Graur TEL AVIV UNIVERSITY Wen-Hsiung Li UNIVERSITY OF CHICAGO SINAUER ASSOCIATES, INC., Publishers Sunderland, Massachusetts Contents Preface xiii

AP Biology. Cladistics

AP Biology. Cladistics Cladistics Kingdom Summary Review slide Review slide Classification Old 5 Kingdom system Eukaryote Monera, Protists, Plants, Fungi, Animals New 3 Domain system reflects a greater understanding of evolution

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science Phylogeny and Evolution Gina Cannarozzi ETH Zurich Institute of Computational Science History Aristotle (384-322 BC) classified animals. He found that dolphins do not belong to the fish but to the mammals.

Molecular evolution. Joe Felsenstein. GENOME 453, Autumn Molecular evolution p.1/49

Molecular evolution. Joe Felsenstein. GENOME 453, Autumn Molecular evolution p.1/49 Molecular evolution Joe Felsenstein GENOME 453, Autumn 2009 Molecular evolution p.1/49 A data example for phylogeny inference Five DNA sequences, for some gene in an imaginary group of species whose names

Biology 211 (2) Week 1 KEY!

Biology 211 (2) Week 1 KEY! Biology 211 (2) Week 1 KEY Chapter 1 KEY FIGURES: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7 VOCABULARY: Adaptation: a trait that increases the fitness Cells: a developed, system bound with a thin outer layer made of

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley

PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION Integrative Biology 200B Spring 2009 University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley B.D. Mishler Jan. 22, 2009. Trees I. Summary of previous lecture: Hennigian

Phylogeny and the Tree of Life

Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

Concepts and Methods in Molecular Divergence Time Estimation

Concepts and Methods in Molecular Divergence Time Estimation Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço jcarrico@fm.ul.pt Charles Darwin (1809-1882) Charles Darwin's tree of life in Notebook B, 1837-1838 Ernst Haeckel (1834-1919)

Phylogeny: traditional and Bayesian approaches

Phylogeny: traditional and Bayesian approaches Phylogeny: traditional and Bayesian approaches 5-Feb-2014 DEKM book Notes from Dr. B. John Holder and Lewis, Nature Reviews Genetics 4, 275-284, 2003 1 Phylogeny A graph depicting the ancestor-descendent

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger The tree of life, Gustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 2 1. Molecular Markers Inferring Molecular Phylogeny 3 Immunological comparisons! Nuttall

Principles of Phylogeny Reconstruction How do we reconstruct the tree of life? Basic Terminology. Looking at Trees. Basic Terminology.

Principles of Phylogeny Reconstruction How do we reconstruct the tree of life? Basic Terminology. Looking at Trees. Basic Terminology. Principles of Phylogeny Reconstruction How do we reconstruct the tree of life? Phylogeny: Basic Terminology Outline: Terminology Phylogenetic tree: Methods Problems parsimony maximum likelihood bootstrapping

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2018 University of California, Berkeley

Integrative Biology 200 PRINCIPLES OF PHYLOGENETICS Spring 2018 University of California, Berkeley Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2018 University of California, Berkeley B.D. Mishler Feb. 14, 2018. Phylogenetic trees VI: Dating in the 21st century: clocks, & calibrations;

Michael Yaffe Lecture #4 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #4 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #4 Database Searching & Molecular Phylogenetics Michael Yaffe A B C D A B C D (((A,B)C)D) Outline FASTA, Blast searching, Smith-Waterman Psi-Blast Review of Genomic DNA structure Substitution

Phylogeny. November 7, 2017

Phylogeny. November 7, 2017 Phylogeny November 7, 2017 Phylogenetics Phylon = tribe/race, genetikos = relative to birth Phylogenetics: study of evolutionary relationships among organisms, sequences, or anything in between Related

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004, Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin- 1837

Building Phylogenetic Trees UPGMA & NJ

Building Phylogenetic Trees UPGMA & NJ Building Phylogenetic Trees UPGMA & NJ UPGMA UPGMA Unweighted Pair-Group Method with Arithmetic mean Unweighted = all pairwise distances contribute equally. Pair-Group = groups are combined in pairs. Arithmetic

Intraspecific gene genealogies: trees grafting into networks

Intraspecific gene genealogies: trees grafting into networks Intraspecific gene genealogies: trees grafting into networks by David Posada & Keith A. Crandall Kessy Abarenkov Tartu, 2004 Article describes: Population genetics principles Intraspecific genetic variation
