The use of molecular tools for taxonomic research in zoology & botany



2 Outline
- Why employ molecular genetic markers?
- Brief historical overview of DNA research
- Molecular techniques for genetic analysis
- DNA sequence analysis
  - Collect data
  - Retrieve homologous sequences
  - Multiple sequence alignment
  - Phylogenetic inference
- DNA sequence alignment
- Terminology of phylogenetic trees
- Phylogenetic inference

3 Why employ Molecular Genetic Markers
Systematics: the biological discipline devoted to characterizing the diversity of life and organizing our knowledge about this diversity.
Tools:
- Morphology
- Physiology
- Behaviour
- Embryology
- Other organismal characteristics
- Genomic information
Carolus Linnaeus (Carl von Linné, 1707-1778): Swedish scientist who laid the foundation for modern taxonomy.

4 Why employ Molecular Genetic Markers
Genomic information:
- Human genome: 3 billion bp; only 1.5 % of it codes for proteins
- Fungi, plants, animals: 10 million bp to 200 billion bp
- Bacterial genomes: 0.5 million bp to 10 million bp
- Protists: 20 million bp to 500 billion bp

5 Why employ Molecular Genetic Markers
Levels of genetic variation:
- Randomly drawn pairs of homologous DNA sequences from the human gene pool typically differ at about 0.1 % of nucleotide positions
- Two random human genomes therefore differ at approximately 3 million nucleotide positions (0.1 % of 3 billion bp)
- Most other species display higher levels of nucleotide diversity

6 Why NOT employ Molecular Genetic Markers
- Molecular laboratory
- Trained staff
- Genetic analysis
- Data analysis
- Cost

7 Historical overview
- 1944: experimental evidence that DNA is the genetic material
- 1953: Watson and Crick propose a molecular model for DNA structure
- 1966: Margoliash determines the amino acid sequence of cytochrome c in several taxa and generates the first phylogenetic tree
- 1968: Kimura proposes the neutral theory of molecular evolution
- 1977: Maxam & Gilbert and Sanger et al. describe laboratory methods for DNA sequencing
- 1979: Avise et al. and Brown et al. introduce mtDNA approaches to study natural populations
- 1981: Palmer et al. initiate the use of cpDNA for molecular phylogenetic reconstruction in plants

8 Historical overview
- 1985: Saiki and Mullis et al. report the enzymatic in vitro amplification of DNA via the polymerase chain reaction (PCR)
- 1989: Kocher et al. describe conserved PCR primers to amplify mtDNA fragments from many species
- 2001: publication of the draft sequence of the human genome by Lander et al. and Venter et al.
- 2005: Margulies et al. develop a high-throughput parallel sequencing technology for sequencing full genomes (454 sequencing)
Press release, May 31, 2007: 454 Life Sciences Corporation, in collaboration with scientists at the Human Genome Sequencing Center, Baylor College of Medicine, announced today in Houston, Texas, the completion of a project to sequence the genome of James D. Watson, Ph.D., co-discoverer of the double-helix structure of DNA. The mapping of Dr. Watson's genome was completed using the Genome Sequencer FLX system and marks the first individual genome to be sequenced for less than $1 million. "When we began the Human Genome Project, we anticipated it would take 15 years to sequence the 3 billion base pairs and identify all the genes," said Richard Gibbs, Ph.D., director, Human Genome Sequencing Center, Baylor College of Medicine. "We completed it in 13 years in 2003, coinciding with the 50th anniversary of the publication of the work of Watson and Dr. Francis Crick that described the double helix. Today, we give James Watson a DVD containing his personal genome, a project completed in only two months. It demonstrates how far sequencing technology has come in a short time."

9 Historical overview Interactive Timelines: HUGO (

10 Molecular techniques
- Protein immunology (since 1904)
  - Immunological distance between taxa
  - First method used for phylogenetics
- Protein electrophoresis (mid-1960s)
  - Starch-gel electrophoresis (SGE)
  - Allozyme polymorphisms

11 Molecular techniques
DNA technology:
- DNA-DNA hybridization
  - Yields mean genetic differences across a large fraction of any two genomes
  - Source of phylogenetic information (DNA-DNA hybridizations on 1700 avian species)
- Restriction analysis
  - Discovery of restriction endonucleases (1968)
  - Cleave duplex DNA at particular oligonucleotide sequences (EcoRI: 5'-GAATTC-3')
  - RFLP: Restriction Fragment Length Polymorphism
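The idea behind RFLP can be sketched in a few lines of Python. This is only an illustration: the function name `fragment_lengths` and the two short "alleles" are made up for the example, and the digest is simplified to a linear, single-stranded sequence. The only facts taken from the slide are the EcoRI recognition site; the cut position (between G and AATTC) is the standard one for EcoRI.

```python
ECORI_SITE = "GAATTC"   # recognition sequence from the slide
CUT_OFFSET = 1          # EcoRI cuts after the G: G^AATTC

def fragment_lengths(seq, site=ECORI_SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Hypothetical alleles: a single point mutation (A -> G) destroys the second site.
allele_1 = "TTGAATTCAGGCTTAGGAATTCCA"
allele_2 = "TTGAATTCAGGCTTAGGAGTTCCA"

print(fragment_lengths(allele_1))  # three fragments
print(fragment_lengths(allele_2))  # two fragments: a length polymorphism
```

A single substitution inside a recognition site changes the number and sizes of the fragments, which is the polymorphism that RFLP detects on a gel.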

12 Molecular techniques
DNA technology:
- RAPD: Randomly Amplified Polymorphic DNA
- AFLP: Amplified Fragment-Length Polymorphism
- SSCP: Single-Strand Conformational Polymorphism
- SINE: Short Interspersed Elements
- STR: Short Tandem Repeat (microsatellites)
- SNP: Single Nucleotide Polymorphism
- DNA sequencing


16 DNA sequencing


18 DNA sequence alignment
[Figure: two example alignments of five DNA sequences; columns marked * are aligned without gaps, columns marked ? lie in a region containing gaps]

19 What is a Multiple Alignment?
An alignment is a hypothesis of positional homology between nucleotide bases / amino acids.
[Figure: the same two example alignments as on the previous slide]

20 Multiple Sequence Alignment- Methods
- Manual
- Automatic
- Combined

21 Overview of ClustalW procedure
Example sequences: Hbb_Human, Hbb_Horse, Hba_Human, Hba_Horse, Myg_Whale
1. Quick pairwise alignment: calculate distance matrix
2. Neighbor-joining tree (guide tree)
3. Progressive alignment following the guide tree
[Figure: distance matrix, guide tree and progressive alignment of the five example sequences]

22 ClustalW- First pair
Align the two most closely related sequences first. This alignment is then fixed and will never change. If a gap is to be introduced subsequently, then it will be introduced in the same place in both sequences, but their relative alignment remains unchanged.

23 ClustalW- Decision time
Next, consult the guide tree to see which alignment is performed next. Either two different sequences are aligned together, or a third sequence is aligned to the first two.
[Figure: guide tree of Hbb_Human, Hbb_Horse, Hba_Human, Hba_Horse, Myg_Whale]

24 ClustalW- Alternative 1
If a third sequence is aligned to the first two, then when a gap has to be introduced to improve the alignment, the existing pair and the third sequence are each treated as a single sequence.

25 ClustalW- Alternative 2
If, on the other hand, two separate sequences have to be aligned together, then the first pairwise alignment is placed to one side and the pairwise alignment of the other two is carried out.

26 ClustalW- Progression
The alignment is progressively built up in this way, with each step being treated as a pairwise alignment, sometimes with each member of a pair having more than one sequence.

27 ClustalW- Good points/bad points
Advantages:
- Speed
Disadvantages:
- No objective function
- No way of quantifying whether or not the alignment is good
- No way of knowing if the alignment is correct

28 ClustalW- User-supplied values
Two penalties are set by the user (there are default values, but you should know that it is possible to change these):
- GOP (Gap Opening Penalty): the cost of opening a gap in an alignment
- GEP (Gap Extension Penalty): the cost of extending this gap
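How the two penalties interact can be illustrated with a small Python sketch. It is not ClustalW's scoring code: the function name `gap_cost`, the default values 10.0 and 0.5, and the convention that a gap of length n costs GOP + n x GEP are all assumptions chosen for the example (programs differ on whether the first gap position also pays the extension penalty).

```python
def gap_cost(aligned_row, gop=10.0, gep=0.5):
    """Total gap penalty for one row of an alignment, gaps written as '-'.

    Assumed convention: a run of n gap characters costs gop + n * gep.
    The default penalty values are arbitrary illustrative numbers.
    """
    cost = 0.0
    in_gap = False
    for ch in aligned_row:
        if ch == "-":
            if not in_gap:      # a new gap starts here
                cost += gop
                in_gap = True
            cost += gep         # every gap position pays the extension cost
        else:
            in_gap = False
    return cost

print(gap_cost("AC---GT"))   # one gap of length 3
print(gap_cost("A-C-G-T"))   # three gaps of length 1
```

With a high opening penalty and a low extension penalty, one long gap is much cheaper than several short ones, so the aligner prefers to concentrate indels in a few places.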

29 Advice on alignments
- Treat cautiously
- Can be improved by eye (usually)
- Often helps to have colour-coding
- Depending on the use, the user should be able to make a judgement on which regions are reliable and which are not
- For phylogeny reconstruction, only use those positions whose hypothesis of positional homology is unimpeachable


31 Terminology
A tree is a mathematical structure used to model the actual evolutionary history of a group of sequences or organisms. It represents the phylogenetic relationships between organisms or genes and consists of nodes connected by branches:
- Terminal nodes: leaves, OTUs (Operational Taxonomic Units) or terminal taxa
- Internal nodes: represent hypothetical ancestors

32 Terminology
Types of phylogenetic trees:
- Cladogram: shows relative recency of common ancestry
- Additive trees (or metric trees or phylograms): contain additional information, namely branch lengths, which correspond to the amount of evolutionary change
- Ultrametric trees (or dendrograms): a special kind of additive tree in which the tips are all equidistant from the root

33 Terminology
Rooted versus unrooted trees:
- Rooted tree: has a root node; the direction away from the root corresponds to evolutionary time
- Unrooted tree: specifies the relationships between OTUs but does not define the evolutionary path

34 Terminology

35 Terminology
Homoplasy:
- Parallel evolution
- Convergent evolution
- Secondary loss
Similarity: any 2 sequences can be compared and the similarity computed (% nucleotide identity). Allowing gaps, 2 non-homologous nt sequences can have a similarity of up to 50 %; for aa sequences this can be up to 20 %.
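Percent identity between two aligned sequences can be computed as below. This is a minimal sketch: the function name `percent_identity` is made up, and the choice to skip every column that contains a gap is an assumption (other conventions count gap columns as mismatches, which gives a lower value).

```python
def percent_identity(row_a, row_b):
    """% identity over aligned columns in which neither sequence has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(row_a.upper(), row_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned pair: 5 comparable columns, 4 of them identical.
print(percent_identity("ACGT-A", "ACCTTA"))
```

Because similarity can always be computed, even for unrelated sequences, a high value is not by itself evidence of homology.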


37 Phylogenetic Inference
Commonly used methods are usually classified into four major groups:
- parsimony methods
- distance methods
- likelihood methods
- Bayesian methods

38 Phylogenetic Inference

39 Cluster methods vs. search methods
Cluster methods use an algorithm (a set of steps) to generate a tree:
- easy to implement
- computationally efficient
- produce a single tree
- the tree depends upon the order in which sequences are added to it
Search methods use an optimality criterion to choose among the set of all possible trees. The optimality criterion gives each tree a score based on how well the tree fits the data.
- Advantage: search methods use an explicit function relating the trees to the data
- Disadvantage: computationally very expensive (NP-complete problem)

40 Maximum Parsimony
- Aims to find the tree topology that can be explained with the smallest number of character changes
- The most parsimonious (simplest) explanation is assumed to be the most likely one evolutionarily
- Given a set of characters, such as aligned sequences, parsimony analysis works by determining the fit (number of steps) of each character on a given tree
- The sum over all characters is called the tree length
- Most parsimonious trees (MPTs) have the minimum tree length needed to explain the observed characters
- Requires evaluation of the tree length for all possible topologies

41 Maximum Parsimony
[Table: aligned sites for seq 1 to seq 4, with the number of steps per site and in total on each of the three possible trees ((1,2),(3,4)), ((1,3),(2,4)) and ((1,4),(2,3))]
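The step counting for four sequences can be sketched with Fitch's set-based algorithm, applied to each of the three trees named on the slide. The alignment below is hypothetical (the slide's own data are not used), and the function names `fitch_steps_quartet` and `tree_length` are invented for this illustration.

```python
def fitch_steps_quartet(states, pairing):
    """Minimum number of changes for one site on a four-taxon tree.

    states:  dict taxon -> nucleotide at this site
    pairing: ((a, b), (c, d)), the two pairs joined by the internal branch
    """
    steps = 0
    node_sets = []
    for left, right in pairing:
        s_left, s_right = {states[left]}, {states[right]}
        common = s_left & s_right
        if common:
            node_sets.append(common)
        else:                       # no shared state: one change is needed
            node_sets.append(s_left | s_right)
            steps += 1
    if not (node_sets[0] & node_sets[1]):
        steps += 1                  # change along the internal branch
    return steps

def tree_length(alignment, pairing):
    """Sum of per-site steps (the tree length) for one topology."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps_quartet({t: s[i] for t, s in alignment.items()}, pairing)
        for i in range(n_sites)
    )

# Hypothetical alignment of four sequences, five sites.
alignment = {"1": "AATAA", "2": "ACTGA", "3": "GCTAG", "4": "GCTGG"}
trees = {
    "((1,2),(3,4))": (("1", "2"), ("3", "4")),
    "((1,3),(2,4))": (("1", "3"), ("2", "4")),
    "((1,4),(2,3))": (("1", "4"), ("2", "3")),
}
lengths = {name: tree_length(alignment, p) for name, p in trees.items()}
best = min(lengths, key=lengths.get)
print(lengths, best)
```

Only sites where at least two states each occur in at least two sequences distinguish between the topologies; the constant site and the site with a single deviating sequence cost the same on every tree.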

42 Maximum Parsimony
Results:
- One or more most parsimonious trees
- Hypotheses of character evolution associated with each tree (where and how changes have occurred)
- Branch lengths (amounts of change associated with branches)
- Various tree and character statistics describing the fit between tree and data

43 Maximum Parsimony
Advantages:
- A simple method with an easily understood operation
- Does not seem to depend on an explicit model of evolution
- Gives trees and associated hypotheses of character evolution
- Gives reliable results if the data are well structured and homoplasy is either rare or widely (randomly) distributed on the tree
Disadvantages:
- May give misleading results if homoplasy is common
- Underestimates branch lengths
- The model of evolution is implicit: the behaviour of the method is not well understood

44 Distance methods

45 Distance methods
- Distance estimates attempt to estimate the mean number of changes per site since 2 taxa last shared a common ancestor
- During evolution, multiple hits can have happened at a single position: the evolutionary distance is almost always larger than the dissimilarity (% nucleotide divergence)
[Figure: sequence difference against time/evolutionary distance, showing the observed difference, the expected difference based on the number of mutations that happened, and the correction between the two]

46 Distance methods
Computation of evolutionary distances:
[Figure: aligned sequences, from which the dissimilarity and then the evolutionary distance are derived]
Convert dissimilarity to evolutionary distance by correcting for multiple events per site according to a certain model of evolution.
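As one concrete instance of such a correction, the Jukes-Cantor formula converts an observed proportion of differing sites p into a distance d = -3/4 ln(1 - 4p/3). The Python sketch below assumes that model; the function names and the two short sequences are made up for illustration.

```python
import math

def p_distance(row_a, row_b):
    """Observed dissimilarity: proportion of compared sites that differ."""
    pairs = [(x, y) for x, y in zip(row_a.upper(), row_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor_distance(p):
    """Expected substitutions per site under the Jukes-Cantor model."""
    if p >= 0.75:
        raise ValueError("dissimilarity too high for the correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical pair of aligned sequences differing at 1 site out of 10.
p = p_distance("ACGTACGTAC", "ACGTACGTAT")
d = jukes_cantor_distance(p)
print(p, round(d, 4))
```

The corrected distance is always at least as large as the observed dissimilarity, and the gap widens as p approaches the saturation value of 0.75.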

47 Distance methods
Model of evolution:
[Figure: the four nucleotides, purines (A, G) and pyrimidines (C, T), with every substitution arrow labelled α]
All substitution rates are equal (α).

48 Distance methods
- 4 possible transitions: A <-> G and C <-> T (each counted in both directions)
- 8 possible transversions: A <-> C, A <-> T, G <-> C and G <-> T (each counted in both directions)
- Thus, if mutations were random, transversions would be 2 times more likely than transitions
- Due to steric hindrance and chemical properties, the opposite is true: transitions in general occur 2 times more often
- Transversions result in more disruptive amino acid changes

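49 Distance methods
Model of evolution:
[Figure: the four nucleotides, purines (A, G) and pyrimidines (C, T), with transition arrows labelled α and transversion arrows labelled β]
The rate for transitions (α) is different from the rate for transversions (β).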

50 Distance methods
Nucleotide substitution models:
- Jukes-Cantor (JC) model: equal base frequencies; all substitutions equally likely
  - Allow for transition/transversion bias: gives K2P
  - Allow base frequencies to vary: gives F81
- Felsenstein (F81) model: unequal base frequencies; all substitutions equally likely
  - Allow for transition/transversion bias: gives HKY85
- Kimura 2-parameter (K2P) model: equal base frequencies; transversions and transitions have different substitution rates
  - Allow base frequencies to vary: gives HKY85
- Hasegawa et al. (HKY85) model: unequal base frequencies; transversions and transitions have different substitution rates
  - Allow all six pairs of substitutions to have different rates: gives GTR
- General time-reversible (GTR) model: unequal base frequencies; all six pairs of substitutions have different rates
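For the K2P model, the distance has a closed form in terms of the observed proportions of transitions (P) and transversions (Q): d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). A minimal Python sketch follows; the function name `k2p_distance` and the example sequences are assumptions made for the illustration.

```python
import math

PURINES = {"A", "G"}
BASES = set("ACGT")

def k2p_distance(row_a, row_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    n = transitions = transversions = 0
    for x, y in zip(row_a.upper(), row_b.upper()):
        if x not in BASES or y not in BASES:
            continue                    # skip gaps and ambiguity codes
        n += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1            # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)

# Hypothetical pair: 20 sites, 2 transitions and 1 transversion.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GTTTACGTACGTACGTACGT"
print(round(k2p_distance(seq_a, seq_b), 4))
```

The observed dissimilarity here is 3/20 = 0.15; the corrected distance is somewhat larger because some sites are assumed to have changed more than once.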

51 Distance methods
Advantages:
- Fast: suitable for analysing data sets which are too large for ML
- A large number of models are available with many parameters, which improves the estimation of distances
Disadvantages:
- Information is lost: given only the distances, it is impossible to derive the original sequences
- Only through character-based analyses (ML, parsimony) can the most informative positions be inferred
- Generally outperformed by maximum likelihood methods in choosing the correct tree in computer simulations

52 Maximum likelihood methods
Maximum likelihood methods of phylogenetic inference evaluate a hypothesis about evolutionary history (the branching order and branch lengths of a tree) in terms of the probability that a proposed model of the evolutionary process and the hypothesised history (tree) would give rise to the data we observe.
The likelihood of observing a given set of sequence data under a specific substitution model is maximized for each topology, and the topology that gives the highest maximum likelihood is chosen as the final tree.
The method requires a probabilistic model for the process of nucleotide substitution.
Maximum likelihood methods of tree building must solve two problems:
- For a given topology, what set of branch lengths makes the observed data most likely (what is the maximum likelihood value for that tree)?
- Which tree of all the possible trees has the greatest likelihood?

53 Maximum likelihood methods
Given a set of aligned nucleotide sequences for four OTUs: what is the probability that this tree could have generated the data under our chosen model of evolution?
- Under the assumption that nucleotide sites evolve independently, we can calculate the likelihood for each site separately and combine the likelihoods into a total value
- To calculate the likelihood for some site j, consider all possible scenarios (combinations of nucleotides at the two internal nodes): there are 16 possibilities to consider
- Having calculated the likelihoods at each site, the joint probability that the tree and model confer upon all sites is computed as the product of the individual site likelihoods
- Because the probability of any single observation is an extremely small number, we almost always evaluate the log of the likelihood instead, so the probabilities are accumulated as the sum of the logs of the single-site likelihoods
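The calculation can be written out directly for a four-OTU tree. The sketch below assumes the Jukes-Cantor model, the unrooted topology ((1,2),(3,4)), and made-up branch lengths and sequences; all function names are hypothetical. It sums over the 16 combinations of nucleotides at the two internal nodes for each site and then adds the logs over sites.

```python
import math
from itertools import product

BASES = "ACGT"

def jc_prob(x, y, t):
    """Jukes-Cantor probability of base x ending as y along a branch of
    length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e

def site_likelihood(tips, branch_lengths):
    """Likelihood of one site on the tree ((1,2),(3,4)).

    tips:           the four observed bases (OTU 1 to 4)
    branch_lengths: (t1, t2, t3, t4, t_internal)
    """
    a, b, c, d = tips
    t1, t2, t3, t4, t_int = branch_lengths
    total = 0.0
    for x, y in product(BASES, repeat=2):       # 16 internal-node scenarios
        total += (0.25                          # equal base frequencies
                  * jc_prob(x, a, t1) * jc_prob(x, b, t2)
                  * jc_prob(x, y, t_int)
                  * jc_prob(y, c, t3) * jc_prob(y, d, t4))
    return total

def log_likelihood(alignment, branch_lengths):
    """Sum of the logs of the single-site likelihoods."""
    return sum(math.log(site_likelihood(column, branch_lengths))
               for column in zip(*alignment))

# Hypothetical data: four aligned sequences and arbitrary branch lengths.
alignment = ["AATAA", "ACTGA", "GCTAG", "GCTGG"]
branches = (0.1, 0.1, 0.1, 0.1, 0.2)
print(round(log_likelihood(alignment, branches), 3))
```

A full analysis would additionally optimize the branch lengths for each topology and compare the resulting maxima; here the branch lengths are simply fixed.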

54 Maximum likelihood methods
Advantages:
- Mathematically rigorous and performs well in computer simulations
- Allows investigation of the fit between model and data
- Provides a simple way of comparing trees according to their likelihoods (difference tests, e.g. the Kishino-Hasegawa test)
Disadvantages:
- Maximum likelihood will only be consistent (converge on the true tree) if evolution proceeds according to the assumed model: how well does the model fit the data?
- Becomes computationally infeasible with many taxa or many model parameters

55 Choosing Models
Models can be made more parameter rich to increase their realism:
- But the more parameters you estimate from the data, the more time is needed for an analysis and the more sampling error accumulates
- One might have a realistic model but large sampling errors
- Realism comes at a cost in time and precision!
- Fewer parameters may give an inaccurate estimate, but more parameters decrease the precision of the estimate
- In general, use the simplest model which fits the data
- Compare nested models incorporating additional parameters for their likelihoods
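Two common ways to compare models by their likelihoods are the likelihood ratio statistic for nested models, 2(lnL_complex - lnL_simple), and the Akaike information criterion, AIC = 2k - 2 lnL. The sketch below uses invented log-likelihood values; the parameter counts are the usual numbers of free substitution-model parameters (branch lengths not counted), and the function names are hypothetical.

```python
def lrt_statistic(lnl_simple, lnl_complex):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (lnl_complex - lnl_simple)

def aic(lnl, n_params):
    """Akaike information criterion: lower is better."""
    return 2.0 * n_params - 2.0 * lnl

# Hypothetical log-likelihoods for one data set (values are made up).
models = {
    "JC":    {"lnl": -1250.0, "k": 0},
    "K2P":   {"lnl": -1230.0, "k": 1},
    "HKY85": {"lnl": -1228.5, "k": 4},
    "GTR":   {"lnl": -1227.9, "k": 8},
}
scores = {name: aic(m["lnl"], m["k"]) for name, m in models.items()}
best_model = min(scores, key=scores.get)
jc_vs_k2p = lrt_statistic(models["JC"]["lnl"], models["K2P"]["lnl"])
print(scores, best_model, jc_vs_k2p)
```

In this made-up case the single extra parameter of K2P improves the fit a great deal (the statistic of 40 is far above 3.84, the 5 % critical value of a chi-square distribution with 1 degree of freedom), while the further parameters of HKY85 and GTR do not repay their cost.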


57 Cluster methods
UPGMA (unweighted pair group method with arithmetic means):
- Clustering is done by searching for the smallest distance in the pairwise distance matrix
- Only one tree is obtained
Neighbour-joining (NJ):
- The NJ algorithm uses as its branch length criterion a corrected average of the distances from an OTU to all other OTUs: unequal branch lengths are allowed
- Only one tree is obtained

58 Cluster methods UPGMA
Suppose a matrix of pairwise distances.
[Table: pairwise distance matrix for OTUs A, B, C, D, E, F; figure: A and B joined]
Compute new distances between (AB) and the other OTUs:
d(AB)C = (dAC + dBC)/2 = 4
d(AB)D = (dAD + dBD)/2 = 6
d(AB)E = (dAE + dBE)/2 = 6
d(AB)F = (dAF + dBF)/2 = 8

59 Clustering methods UPGMA
[Table: distance matrix for (AB), C, D, E, F; figure: D and E joined]
Compute new distances between (DE) and the other OTUs:
d(DE)(AB) = (dD(AB) + dE(AB))/2 = 6
d(DE)C = (dDC + dEC)/2 = 6
d(DE)F = (dDF + dEF)/2 = 8

60 Clustering methods UPGMA
[Table: distance matrix for (AB), C, (DE), F; figure: C joined to (AB)]
Compute new distances between (ABC) and the other OTUs:
d(ABC)(DE) = (d(AB)(DE) + dC(DE))/2 = 6
d(ABC)F = (d(AB)F + dCF)/2 = 8

61 Clustering methods UPGMA
[Table: distance matrix for (ABC), (DE), F; figure: (ABC) joined to (DE)]
Compute new distances between (ABCDE) and OTU F:
d(ABCDE)F = (d(ABC)F + d(DE)F)/2 = 8

62 Clustering methods UPGMA
[Figure: final UPGMA tree ((((A,B),C),(D,E)),F) with branch lengths]
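The clustering steps above can be reproduced with a short Python sketch. The distance matrix used here is an assumption: it is a six-OTU matrix chosen so that the averages match the values on the slides (4, 6, 6, 8, and so on), not data recovered from them. Note also that the sketch weights each average by cluster size, which is the usual definition of UPGMA; the slides average the two merged clusters directly, and on this particular matrix both give the same numbers. The function name `upgma` is made up.

```python
def upgma(distances):
    """Cluster OTUs by UPGMA.

    distances: {(a, b): d} for every pair of OTU names.
    Returns a list of (merged_cluster, node_height) in merge order,
    where clusters are nested tuples and height is half the merge distance.
    """
    d = {frozenset(pair): float(v) for pair, v in distances.items()}
    size = {name: 1 for pair in d for name in pair}
    merges = []
    while len(size) > 1:
        pair = min(d, key=d.get)                 # smallest remaining distance
        a, b = sorted(pair, key=str)
        new = (a, b)
        merges.append((new, d[pair] / 2.0))
        for other in size:
            if other in pair:
                continue
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            # size-weighted arithmetic mean of the two old distances
            d[frozenset((new, other))] = (
                (size[a] * d_a + size[b] * d_b) / (size[a] + size[b])
            )
        del d[pair]
        size[new] = size.pop(a) + size.pop(b)
    return merges

# Hypothetical matrix, consistent with the averages shown on the slides.
matrix = {
    ("A", "B"): 2,
    ("A", "C"): 4, ("B", "C"): 4,
    ("A", "D"): 6, ("B", "D"): 6, ("C", "D"): 6,
    ("A", "E"): 6, ("B", "E"): 6, ("C", "E"): 6, ("D", "E"): 4,
    ("A", "F"): 8, ("B", "F"): 8, ("C", "F"): 8, ("D", "F"): 8, ("E", "F"): 8,
}
merges = upgma(matrix)
for cluster, height in merges:
    print(cluster, height)
```

Where two distances tie (here (AB)-C and D-E are both 4), the order of those two merges depends on how ties are broken, but the resulting tree is the same.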

63 Search methods
- Exhaustive search: guaranteed to find the minimum tree because all tree topologies are evaluated. Not feasible for more than about 10 sequences
- Branch and bound: guaranteed to find the minimum tree without evaluating all tree topologies; a larger number of taxa can be evaluated, but the number is still limited (depends on the dataset)
- Heuristic searches: not guaranteed to find the minimal tree; use stepwise addition of taxa and a rearrangement process (branch swapping)

64 Search methods
[Figure: the 15 possible unrooted topologies for the five taxa A to E]
Number of unrooted trees for n taxa:
$$N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$
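The number of unrooted bifurcating trees for n taxa equals the double factorial (2n-5)!! = 3 x 5 x ... x (2n-5), which grows very quickly. A small Python sketch (the function name is made up) shows why exhaustive search stops being practical at around ten sequences:

```python
def n_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees: (2n-5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):    # 3, 5, ..., 2n-5
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, n_unrooted_trees(n))
```

Five taxa give the 15 topologies of the slide; ten taxa already give more than two million.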

65 Branch and bound search methods
[Figure: the three four-taxon topologies for A, B, C, D require 13, 16 and 17 substitutions; adding E to the 13-substitution topology gives a five-taxon tree with 15 substitutions]
Do not retain partial topologies that already require more substitutions than a complete tree found so far: the four-taxon topologies with 16 and 17 substitutions cannot improve on the 15 substitutions found with 5 taxa, so only 5 topologies have to be investigated instead of 15!
--- Introductory seminar on the use of molecular tools in natural history collections, November 2007, RMCA ---

66 Search methods Heuristic
- Start with stepwise addition
- Perform branch swapping, e.g. Tree Bisection Reconnection (TBR)
[Figure: TBR rearrangement illustrated on a tree of the taxa A to G]

67 search methods Heuristic

68 Bootstrapping

69 Bayesian phylogenetics
- Prior probability Pr[Tree_i]: the probability of the tree before observations have been made
- Likelihood Pr[Data | Tree_i]: proportional to the probability of the observations (= alignment); requires specific assumptions about the process generating the observations (= parameters of the evolutionary model)
- Posterior probability Pr[Tree_i | Data]: the probability of the tree conditional on the observations (= alignment); obtained by combining prior and likelihood for each tree using Bayes' formula:
$$\Pr[\mathrm{Tree}_i \mid \mathrm{Data}] = \frac{\Pr[\mathrm{Data} \mid \mathrm{Tree}_i]\,\Pr[\mathrm{Tree}_i]}{\sum_j \Pr[\mathrm{Data} \mid \mathrm{Tree}_j]\,\Pr[\mathrm{Tree}_j]}$$
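For a handful of candidate trees, combining prior and likelihood is a matter of multiplying and normalizing. The numbers below are invented for illustration (three trees, a flat prior, made-up likelihoods), and the function name `posterior_probabilities` is hypothetical.

```python
def posterior_probabilities(priors, likelihoods):
    """Bayes' formula: normalize prior x likelihood over all candidate trees."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical values for the three unrooted trees of four taxa.
priors = [1 / 3, 1 / 3, 1 / 3]          # flat prior over topologies
likelihoods = [2e-5, 1e-5, 1e-5]        # Pr[Data | Tree_i], made up
posteriors = posterior_probabilities(priors, likelihoods)
print(posteriors)
```

With realistic numbers of taxa the sum in the denominator runs over far too many trees to enumerate, which is why sampling algorithms are used instead.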

70 Bayesian phylogenetics
- The optimal tree is the one that maximizes the posterior probability
- Bayesian methods allow complex models of evolution to be implemented (ML methods have problems when the ratio of data points to parameters is low)
- Bayesian methods rely on an algorithm (MCMC, Markov chain Monte Carlo) that does not attempt to find the highest point in the space of all parameters
- Parameters are treated differently than in ML methods (marginal vs joint estimation)
- Provides support measures (no bootstrapping needed)
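The way MCMC samples a distribution rather than climbing to its highest point can be shown with a toy Metropolis sampler. This is a deliberately simplified sketch: the "states" are just three labels standing in for trees, the weights are made-up unnormalized posterior values, and the function name `metropolis` is hypothetical. Real phylogenetic samplers propose changes to topologies, branch lengths and model parameters.

```python
import random

def metropolis(weights, n_steps, seed=0):
    """Toy Metropolis sampler over a few discrete states.

    weights: unnormalized posterior value (prior x likelihood) per state
    Returns the fraction of steps the chain spent in each state.
    """
    rng = random.Random(seed)
    n_states = len(weights)
    state = 0
    visits = [0] * n_states
    for _ in range(n_steps):
        proposal = rng.randrange(n_states)          # symmetric proposal
        # accept with probability min(1, weight ratio)
        if rng.random() < weights[proposal] / weights[state]:
            state = proposal
        visits[state] += 1
    return [v / n_steps for v in visits]

# Hypothetical unnormalized posteriors for three candidate trees.
frequencies = metropolis([2.0, 1.0, 1.0], n_steps=50000, seed=1)
print(frequencies)
```

The chain visits each state in proportion to its posterior probability (here roughly 0.5, 0.25 and 0.25), so the visit frequencies themselves serve as support values, without the normalizing sum ever being computed.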

71 Bayesian phylogenetics

72 Summary Holder & Lewis 2003 Nature Reviews Genetics (4)

73 Terminology
Gene trees and species trees:
- The divergence of genes predates the divergence of species (gene divergence time is longer than species divergence time)
- The topology of the gene tree can differ from the species tree due to lineage sorting; this depends on:
  - long-term effective population size
  - generation time
  - interval between successive speciations
- When speciation events occur only every 1 or 2 million years, it is unlikely that the species tree differs from the gene tree

74 Maximum likelihood methods
A non-biological example: coin tossing.
If the probability of an event X dependent on model parameters p is written P(X | p), then we would talk about the likelihood L(p | X), that is, the likelihood of the parameters given the data.
Likelihood is the hypothetical probability that an event that has already occurred would yield a specific outcome. The concept differs from that of a probability in that a probability refers to the occurrence of future events, while a likelihood refers to past events with known outcomes.
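The coin-tossing example can be made concrete. Suppose, as an invented data set, that 10 tosses gave 7 heads; the likelihood of a heads-probability p is then the binomial probability of that outcome, and the maximum likelihood estimate is the p that makes it largest. The function name and the simple grid search are choices made for this sketch.

```python
import math

def binomial_likelihood(p, heads, tosses):
    """L(p | data): probability of the observed number of heads given p."""
    return math.comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)

# Hypothetical data: 7 heads in 10 tosses.
heads, tosses = 7, 10
candidates = [i / 100 for i in range(1, 100)]
best_p = max(candidates, key=lambda p: binomial_likelihood(p, heads, tosses))

print(binomial_likelihood(0.5, heads, tosses))   # likelihood of a fair coin
print(best_p)                                    # maximum likelihood estimate
```

The data are fixed and p is varied: that is the sense in which the likelihood is a function of the parameters given the data. The maximum falls at the observed proportion of heads.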


More information

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

More information

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions

More information

A (short) introduction to phylogenetics

A (short) introduction to phylogenetics A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field

More information

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods

More information

Molecular Evolution & Phylogenetics

Molecular Evolution & Phylogenetics Molecular Evolution & Phylogenetics Heuristics based on tree alterations, maximum likelihood, Bayesian methods, statistical confidence measures Jean-Baka Domelevo Entfellner Learning Objectives know basic

More information

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

More information

Estimating Evolutionary Trees. Phylogenetic Methods

Estimating Evolutionary Trees. Phylogenetic Methods Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent

More information

Phylogenetic methods in molecular systematics

Phylogenetic methods in molecular systematics Phylogenetic methods in molecular systematics Niklas Wahlberg Stockholm University Acknowledgement Many of the slides in this lecture series modified from slides by others www.dbbm.fiocruz.br/james/lectures.html

More information

SCIENTIFIC EVIDENCE TO SUPPORT THE THEORY OF EVOLUTION. Using Anatomy, Embryology, Biochemistry, and Paleontology

SCIENTIFIC EVIDENCE TO SUPPORT THE THEORY OF EVOLUTION. Using Anatomy, Embryology, Biochemistry, and Paleontology SCIENTIFIC EVIDENCE TO SUPPORT THE THEORY OF EVOLUTION Using Anatomy, Embryology, Biochemistry, and Paleontology Scientific Fields Different fields of science have contributed evidence for the theory of

More information

How to read and make phylogenetic trees Zuzana Starostová

How to read and make phylogenetic trees Zuzana Starostová How to read and make phylogenetic trees Zuzana Starostová How to make phylogenetic trees? Workflow: obtain DNA sequence quality check sequence alignment calculating genetic distances phylogeny estimation

More information

Maximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.

Maximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington. Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters - "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf

More information

Effects of Gap Open and Gap Extension Penalties

Effects of Gap Open and Gap Extension Penalties Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See

More information

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018

Maximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018 Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras

More information

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

More information

Phylogeny. November 7, 2017

Phylogeny. November 7, 2017 Phylogeny November 7, 2017 Phylogenetics Phylon = tribe/race, genetikos = relative to birth Phylogenetics: study of evolutionary relationships among organisms, sequences, or anything in between Related

More information

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics

CS5263 Bioinformatics. Guest Lecture Part II Phylogenetics CS5263 Bioinformatics Guest Lecture Part II Phylogenetics Up to now we have focused on finding similarities, now we start focusing on differences (dissimilarities leading to distance measures). Identifying

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

Lecture 6 Phylogenetic Inference

Lecture 6 Phylogenetic Inference Lecture 6 Phylogenetic Inference From Darwin s notebook in 1837 Charles Darwin Willi Hennig From The Origin in 1859 Cladistics Phylogenetic inference Willi Hennig, Cladistics 1. Clade, Monophyletic group,

More information

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood For: Prof. Partensky Group: Jimin zhu Rama Sharma Sravanthi Polsani Xin Gong Shlomit klopman April. 7. 2003 Table of Contents Introduction...3

More information

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx Phylogenies Scores for Exhaustive Maximum Likelihood and s Searches Hyrum D. Carroll, Perry G. Ridge, Mark J. Clement, Quinn O. Snell

More information

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004, Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin- 1837

More information

Phylogenies & Classifying species (AKA Cladistics & Taxonomy) What are phylogenies & cladograms? How do we read them? How do we estimate them?

Phylogenies & Classifying species (AKA Cladistics & Taxonomy) What are phylogenies & cladograms? How do we read them? How do we estimate them? Phylogenies & Classifying species (AKA Cladistics & Taxonomy) What are phylogenies & cladograms? How do we read them? How do we estimate them? Carolus Linneaus:Systema Naturae (1735) Swedish botanist &

More information

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny

More information

How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?

How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe? How should we go about modeling this? gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time Substitution rate Can we observe time or subst. rate? What

More information

Concepts and Methods in Molecular Divergence Time Estimation

Concepts and Methods in Molecular Divergence Time Estimation Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks

More information

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5. Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony

More information

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe B C D B C D (((,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood

More information

Phylogeny: building the tree of life

Phylogeny: building the tree of life Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan

More information

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building How Molecules Evolve Guest Lecture: Principles and Methods of Systematic Biology 11 November 2013 Chris Simon Approaching phylogenetics from the point of view of the data Understanding how sequences evolve

More information

Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab

Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab Date: Agenda Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab Ask questions based on 5.1 and 5.2 Quiz on 5.1 and 5.2 How

More information

Introduction to characters and parsimony analysis

Introduction to characters and parsimony analysis Introduction to characters and parsimony analysis Genetic Relationships Genetic relationships exist between individuals within populations These include ancestordescendent relationships and more indirect

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.

Additive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive. Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then

More information

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki Phylogene)cs IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, 2016 Joyce Nzioki Phylogenetics The study of evolutionary relatedness of organisms. Derived from two Greek words:» Phle/Phylon: Tribe/Race» Genetikos:

More information

Evolutionary Change in Nucleotide Sequences. Lecture 3

Evolutionary Change in Nucleotide Sequences. Lecture 3 Evolutionary Change in Nucleotide Sequences Lecture 3 1 So far, we described the evolutionary process as a series of gene substitutions in which new alleles, each arising as a mutation ti in a single individual,

More information

Letter to the Editor. Department of Biology, Arizona State University

Letter to the Editor. Department of Biology, Arizona State University Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona

More information

A Phylogenetic Network Construction due to Constrained Recombination

A Phylogenetic Network Construction due to Constrained Recombination A Phylogenetic Network Construction due to Constrained Recombination Mohd. Abdul Hai Zahid Research Scholar Research Supervisors: Dr. R.C. Joshi Dr. Ankush Mittal Department of Electronics and Computer

More information

MOLECULAR SYSTEMATICS: A SYNTHESIS OF THE COMMON METHODS AND THE STATE OF KNOWLEDGE

MOLECULAR SYSTEMATICS: A SYNTHESIS OF THE COMMON METHODS AND THE STATE OF KNOWLEDGE CELLULAR & MOLECULAR BIOLOGY LETTERS http://www.cmbl.org.pl Received: 16 August 2009 Volume 15 (2010) pp 311-341 Final form accepted: 01 March 2010 DOI: 10.2478/s11658-010-0010-8 Published online: 19 March

More information

Multiple Sequence Alignment. Sequences

Multiple Sequence Alignment. Sequences Multiple Sequence Alignment Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe

More information

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development

More information

The practice of naming and classifying organisms is called taxonomy.

The practice of naming and classifying organisms is called taxonomy. Chapter 18 Key Idea: Biologists use taxonomic systems to organize their knowledge of organisms. These systems attempt to provide consistent ways to name and categorize organisms. The practice of naming

More information

Evolutionary Tree Analysis. Overview

Evolutionary Tree Analysis. Overview CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based

More information

Molecular Evolution, course # Final Exam, May 3, 2006

Molecular Evolution, course # Final Exam, May 3, 2006 Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least

More information

How should we organize the diversity of animal life?

How should we organize the diversity of animal life? How should we organize the diversity of animal life? The difference between Taxonomy Linneaus, and Cladistics Darwin What are phylogenies? How do we read them? How do we estimate them? Classification (Taxonomy)

More information

Probabilistic modeling and molecular phylogeny

Probabilistic modeling and molecular phylogeny Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical

More information

Classification, Phylogeny yand Evolutionary History

Classification, Phylogeny yand Evolutionary History Classification, Phylogeny yand Evolutionary History The diversity of life is great. To communicate about it, there must be a scheme for organization. There are many species that would be difficult to organize

More information

Phylogenetics Todd Vision Spring Some applications. Uncultured microbial diversity

Phylogenetics Todd Vision Spring Some applications. Uncultured microbial diversity Phylogenetics Todd Vision Spring 2008 Tree basics Sequence alignment Inferring a phylogeny Neighbor joining Maximum parsimony Maximum likelihood Rooting trees and measuring confidence Software and file

More information

Week 5: Distance methods, DNA and protein models

Week 5: Distance methods, DNA and protein models Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03

More information

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline

More information

Consensus Methods. * You are only responsible for the first two

Consensus Methods. * You are only responsible for the first two Consensus Trees * consensus trees reconcile clades from different trees * consensus is a conservative estimate of phylogeny that emphasizes points of agreement * philosophy: agreement among data sets is

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony ioinformatics -- lecture 9 Phylogenetic trees istance-based tree building Parsimony (,(,(,))) rees can be represented in "parenthesis notation". Each set of parentheses represents a branch-point (bifurcation),

More information

Phylogenetic analyses. Kirsi Kostamo

Phylogenetic analyses. Kirsi Kostamo Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

More information

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26 Phylogeny Chapter 26 Taxonomy Taxonomy: ordered division of organisms into categories based on a set of characteristics used to assess similarities and differences Carolus Linnaeus developed binomial nomenclature,

More information

Lecture Notes: Markov chains

Lecture Notes: Markov chains Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity

More information

Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29):

Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29): Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29): Statistical estimation of models of sequence evolution Phylogenetic inference using maximum likelihood:

More information

PHYLOGENY & THE TREE OF LIFE

PHYLOGENY & THE TREE OF LIFE PHYLOGENY & THE TREE OF LIFE PREFACE In this powerpoint we learn how biologists distinguish and categorize the millions of species on earth. Early we looked at the process of evolution here we look at

More information

Classification Systems. - Taxonomy

Classification Systems. - Taxonomy Classification Systems - Taxonomy Why Classify? 2.5 million kinds of organisms Not complete- 20 million organisms estimated Must divide into manageable groups To work with the diversity of life we need

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X

More information

7. Tests for selection

7. Tests for selection Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www. nowicklab.info

More information

Phylogenetic Analysis

Phylogenetic Analysis Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)

More information

Theory of Evolution. Charles Darwin

Theory of Evolution. Charles Darwin Theory of Evolution harles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (8-6) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties

More information

Systematics - Bio 615

Systematics - Bio 615 Bayesian Phylogenetic Inference 1. Introduction, history 2. Advantages over ML 3. Bayes Rule 4. The Priors 5. Marginal vs Joint estimation 6. MCMC Derek S. Sikes University of Alaska 7. Posteriors vs Bootstrap

More information