Multiple Sequence Alignment. Sequences

Similar documents
Algorithms in Bioinformatics

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Cladistics and Bioinformatics Questions 2013

BINF6201/8201. Molecular phylogenetic methods

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Phylogenetic inference

Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

What is Phylogenetics

Dr. Amira A. AL-Hosary

Phylogeny: building the tree of life

CREATING PHYLOGENETIC TREES FROM DNA SEQUENCES

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Introduction to characters and parsimony analysis

Lecture 11 Friday, October 21, 2011

Phylogenetic Tree Reconstruction

Evolutionary Tree Analysis. Overview

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Phylogenetics: Building Phylogenetic Trees

EVOLUTIONARY DISTANCES

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Phylogenetic Trees. How do the changes in gene sequences allow us to reconstruct the evolutionary relationships between related species?

8/23/2014. Phylogeny and the Tree of Life

Macroevolution Part I: Phylogenies

Anatomy of a tree. clade is group of organisms with a shared ancestor. a monophyletic group shares a single common ancestor = tapirs-rhinos-horses

Phylogenetic Analysis

Organizing Life s Diversity

Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2018 University of California, Berkeley

Theory of Evolution Charles Darwin

Phylogenetic Analysis

Phylogenetic Analysis

Chapter 27: Evolutionary Genetics

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B

How to read and make phylogenetic trees Zuzana Starostová

Phylogeny: traditional and Bayesian approaches

Classification, Phylogeny yand Evolutionary History

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

A (short) introduction to phylogenetics

Phylogenetic trees 07/10/13

Constructing Evolutionary/Phylogenetic Trees

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

C3020 Molecular Evolution. Exercises #3: Phylogenetics

Reconstructing the history of lineages

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

Name: Class: Date: ID: A

The Tree of Life. Phylogeny

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

PHYLOGENY WHAT IS EVOLUTION? 1/22/2018. Change must occur in a population via allele

C.DARWIN ( )

Seuqence Analysis '17--lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony

Introduction to Bioinformatics Introduction to Bioinformatics

Phylogeny & Systematics: The Tree of Life

Constructing Evolutionary Trees

Chapter 19 Organizing Information About Species: Taxonomy and Cladistics

CLASSIFICATION OF LIVING THINGS. Chapter 18

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Biology 211 (2) Week 1 KEY!

Chapter 26 Phylogeny and the Tree of Life

Chapter 19: Taxonomy, Systematics, and Phylogeny

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

The Tree of Life. Chapter 17

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Outline. Classification of Living Things

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

Biology 2. Lecture Material. For. Macroevolution. Systematics

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

molecular evolution and phylogenetics

Phylogenetics. BIOL 7711 Computational Bioscience

DNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi

Microbial Taxonomy and the Evolution of Diversity

PHYLOGENY & THE TREE OF LIFE

Phylogenetic inference: from sequences to trees

Modern Evolutionary Classification. Section 18-2 pgs

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

Classification and Phylogeny

Unit 5: Taxonomy. KEY CONCEPT Organisms can be classified based on physical similarities.

Comparative Genomics II

Chapter 17. Organizing Life's Diversity

Chapter 26 Phylogeny and the Tree of Life

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Bioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony

Emily Blanton Phylogeny Lab Report May 2009

Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Classification and Phylogeny


OMICS Journals are welcoming Submissions

Genomics and bioinformatics summary. Finding genes -- computer searches

SCIENTIFIC EVIDENCE TO SUPPORT THE THEORY OF EVOLUTION. Using Anatomy, Embryology, Biochemistry, and Paleontology

Theory of Evolution. Charles Darwin

Transcription:

Multiple Sequence Alignment Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe ssvkdlneakvlavgpgaldkdgkrlpmgvnagdrvl ipqyggspvkvgeeeytlfrdseilakiae > nidulans msllrnvknlaplldrvlvqrvkpeaktasgiflpes svkeqneakvlavgpgavdrngqripmgvaagdrvlv pqfggsplkigeeeyhlfrdseilakine > pombe (fission yeast) matklksaksivplldrilvqrikadtktasgiflpe ksveklsegrvisvgkggynkegklaqpsvavgdrvl lpayggsnikvgeeeyslyrdhellaiike > alpina masritkfsktivpmmdrvlvqrikpqqktasgiyip ekaqealnegyvvavgkglttqegkvvpselaegdkv llppyggsvvkvdneelilfreseilakiq > cohnii matgiakrftplldrvlvqrlkpeaktasglflpesa akapnyatvlavgpggrtrdgdilpmnvkvgdkvvvp eyggmtlkfedeefqvfrdadimgilne > melanogaster maaaikkiipmldriliqraealtktkggivlpekav gkvlegtvlavgpgtrnastgnhipigvkegdrvllp efggtkvnlegdqkelflfresdilakle > sapiens agqafrkflplfdrvlversaaetvtkggimlpeksq gkvlqatvvavgsgskgkggeiqpvsvkvgdkvllpe yggtkvvlddkdyflfrdgxilgky > stearothermophilus vlkplgdrvvievieteektasgivlpdtakekpqeg rvvavgkgrvldsgervapevevgdriifskyagtev kydgkeylilresdilavig > tuberculosis makvnikpledkilvqaneaetttasglvipdtakek pqegtvvavgpgrwdedgekripldvaegdtviysky ggteikyngeeylilsardvlavvsk > musculus (house mouse) magqafrkflllfdrvlversaaetvtkggimlpeks qgkvlqatvvavgsggkgksgeiepvsvkvgdkvllp eyggtkvvlddkdyflfrdsdilgkyvn 1

Multiple Sequence Alignment (MSA) Why MSA? Proteins are often related to a larger group (i.e., a family) of proteins Multiple sequence alignment is more sensitive than pairwise alignment for detecting homologs MSAs can elucidate conserved residues, motifs, or other functional regions in a protein MSA is critical for phylogenetic analysis Selection of sequences Multiple sequence alignment of sequences Tree building Tree evaluation 2

Pairwise Alignment 3-sequence Alignment AGA AGT TCC A G A T C C A G T 0 0 0 0 0 5 0 0 0 0 10 4 0 5 4 6 3

Sequences > YOR020c mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd > crassa mattvrsvksliplldrvlvqrvkaeaktasgiflpe ssvkdlneakvlavgpgaldkdgkrlpmgvnagdrvl ipqyggspvkvgeeeytlfrdseilakiae > nidulans msllrnvknlaplldrvlvqrvkpeaktasgiflpes svkeqneakvlavgpgavdrngqripmgvaagdrvlv pqfggsplkigeeeyhlfrdseilakine > pombe (fission yeast) matklksaksivplldrilvqrikadtktasgiflpe ksveklsegrvisvgkggynkegklaqpsvavgdrvl lpayggsnikvgeeeyslyrdhellaiike > alpina masritkfsktivpmmdrvlvqrikpqqktasgiyip ekaqealnegyvvavgkglttqegkvvpselaegdkv llppyggsvvkvdneelilfreseilakiq > cohnii matgiakrftplldrvlvqrlkpeaktasglflpesa akapnyatvlavgpggrtrdgdilpmnvkvgdkvvvp eyggmtlkfedeefqvfrdadimgilne > melanogaster maaaikkiipmldriliqraealtktkggivlpekav gkvlegtvlavgpgtrnastgnhipigvkegdrvllp efggtkvnlegdqkelflfresdilakle > sapiens agqafrkflplfdrvlversaaetvtkggimlpeksq gkvlqatvvavgsgskgkggeiqpvsvkvgdkvllpe yggtkvvlddkdyflfrdgxilgky > stearothermophilus vlkplgdrvvievieteektasgivlpdtakekpqeg rvvavgkgrvldsgervapevevgdriifskyagtev kydgkeylilresdilavig > tuberculosis makvnikpledkilvqaneaetttasglvipdtakek pqegtvvavgpgrwdedgekripldvaegdtviysky ggteikyngeeylilsardvlavvsk > musculus (house mouse) magqafrkflllfdrvlversaaetvtkggimlpeks qgkvlqatvvavgsggkgksgeiepvsvkvgdkvllp eyggtkvvlddkdyflfrdsdilgkyvn Multiple Sequence Alignment 4

Pairwise Alignment Scores Schizoscchrmycs 49 46 78 45 55 54 44 38 37 42 52 41 40 43 46 44 41 39 43 43 48 45 45 40 40 38 39 42 53 55 41 41 40 40 43 46 40 43 38 39 61 43 34 36 45 49 42 36 49 37 32 93 59 38 32 Guide Tree 5

Constructing a Guide Tree Unweighted pair group method with arithmetic mean (UPGMA) Neighbor joining (NJ) Unweighted Pair Group Method with Arithmetic mean (UPGMA) Assume each organism is its own group Repeat the following step Merge together the two closest groups 6

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Unweighted Pair Group Method with Arithmetic mean (UPGMA) 7

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Unweighted Pair Group Method with Arithmetic mean (UPGMA) 8

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Unweighted Pair Group Method with Arithmetic mean (UPGMA) 9

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Unweighted Pair Group Method with Arithmetic mean (UPGMA) 10

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Unweighted Pair Group Method with Arithmetic mean (UPGMA) 11

Unweighted Pair Group Method with Arithmetic mean (UPGMA) Guide Tree 12

Neighbor Joining (NJ) Generate full tree with starlike structure Repeat the following step Connect two closest groups (i.e., neighbors) through a single node Neighbor Joining 13

Neighbor Joining Neighbor Joining 14

Neighbor Joining Neighbor Joining 15

Neighbor Joining Neighbor Joining 16

Neighbor Joining Neighbor Joining 17

Neighbor Joining Multiple Sequence Alignment 18

Multiple Sequence Alignment Multiple Sequence Alignment 19

What can phylogeny do for you? Why do we care about evolution and the evolutionary history of organisms? OR How do we benefit from phylogeny? AND How is bioinformatics related to any of this? What are the goals of phylogeny? All life forms share a common origin and are part of the Tree of Life 1) Deduce correct trees of life for all species 2) Infer or estimate divergence times How can we use phylogenetic analyses? 20

Revolutionalizing the Tree of Life Carl Woese: rrna IDs Archaea as separate branch of Tree of Life Discovering new life forms 21

Developing effective snakebite antivenins Identifying emergent diseases 22

Protecting ecosystems from invasive species Eurasian water milfoil Purple loosetrife Caulerpa taxifolia Common phylogenetic tree terminology Branches or Lineages A B Terminal Nodes operational taxanomic units (OTUs) Ancestral Node or ROOT of the Tree Internal Nodes hypothetical taxanomic units (HTUs) C D E Represent the TAXA (genes, populations, species, etc.) used to infer the phylogeny 23

Phylogenetic trees can be drawn many ways Clade: group with a single common ancestor and its descendents A B C D E A-B-C clade B-C clade D-E clade 24

Phylogenetic trees can be rooted or unrooted A B B C Rooted C D D Unrooted A Specifies evolutionary path Root node is most recent common ancestor of all TUs; specifies time flow Shows degree of kinship Doesn t make assumptions or require knowledge of common ancestor Phylogenetic trees can be scaled or unscaled C B C B D A D A Unscaled Branch length not proportional to number of changes/distance Scaled Branch length proportional to number of changes/distance Cladogram Phylogram 25

Phylogenetic trees diagram evolutionary relationships B C A D No meaning to the spacing between the taxa, or to the order in which they appear from top to bottom. E 1) No scale (cladograms) 2) Proportional to genetic distance (phylograms) 3) Proportional to time (ultrametric trees) Rotating clades: same meanings B C A D E A B C = E D 26

Interpreting phylogenetic trees Is the frog more closely related to the fish or the human? How are phylogenetic trees built? Traditionally: use homologous structures Caveats: - Closely related organisms don t always look similar - Similar looking organisms not always closely related - How do you decide importance of traits? 27

Structural analogy can result from convergent evolution Classification based on traits can be tricky cell number organelles 28

Molecular phylogenetic trees Large molecular data sets: Bioinformatics! Molecular clock vs. punctuated equilibrium Eliminates analogy and trait selection issues Result: great improvement on classical phylogenies Caveat: Gene divergence may not correlate with species divergence Molecular phylogenies can be constructed using different elements Nuclear genes Mitochondrial DNA Genome structure Reasonably well conserved, present in common ancestors Usually integrate analyses of multiple different genes 29

Molecular comparisons vs. body plans = = Which species are the closest living relatives of modern humans? Gorillas Chimpanzees Bonobos Orangutans Humans Humans Chimpanzees Bonobos Gorillas Orangutans 15-30 MYA 0 14 MYA 0 Pre-molecular view Great apes (chimpanzees, gorillas and orangutans) formed a clade separate from humans. MitoDNA, most nuclear genes, and DNA hybridization Bonobos and chimpanzees are related more closely to humans than either are to gorillas. 30

What is the closest living relative of whales? Phylogenetic trees are hypotheses How do you construct phylogenetic trees? What computational strategies are used? How do you test the robustness of hypotheses? 31