Supplemental Figure 1.


Supplemental Material: Annu. Rev. Genet. 2015. 49:213-42. doi: 10.1146/annurev-genet-120213-092023. A Uniform System for the Annotation of Vertebrate microRNA Genes and the Evolution of the Human microRNAome. Fromm et al.

Supplemental Figure 1. Single nucleotide substitutions change the name of a sequence and hence obfuscate evolutionary understanding. A. Shown are alignments of six let-7 sequences from human and mouse, aligned to the human let-7a-1 sequence. Dashes indicate identity with this human sequence. Note that the human (hsa) let-7c sequence and the mouse (mmu) let-7c-1 sequence (orange), as well as the mmu-let-7c-2 sequence (blue), have a G in position 19 (bold). However, the mouse let-7c-2 gene phylogenetically groups with the human let-7a-3 sequence (B, blue) and sits in the same relative genomic position, between the let-7b gene (C, red) and the protein-coding gene Wnt7b. Hence, orthologous genes are given different names in these two closely related taxa due to a single convergent nucleotide substitution in the mouse sequence relative to the human sequence.
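The dash notation used in panel A can be reproduced with a few lines of code. The following is a minimal sketch, assuming the sequences are already aligned to equal length; the function name `mask_identity` and the toy sequences are hypothetical placeholders and are not the let-7 sequences from the figure.

```python
def mask_identity(reference: str, sequence: str) -> str:
    """Return `sequence` with every position identical to `reference` shown as '-'."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be pre-aligned to the same length")
    return "".join("-" if r == s else s for r, s in zip(reference, sequence))


# Hypothetical toy sequences (not real let-7 reads).
reference = "ACGUACGUAC"
variant = "ACGUACGAAC"  # differs from the reference only at position 8 (1-based)

print(reference)
print(mask_identity(reference, variant))
```

In this display a single substitution stands out as the only non-dash character, which is how the G at position 19 is picked out in the figure.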

Supplemental Figure 2. Differential loss can create confusion in distinguishing orthologues from paralogues. A. The mir-130 family primitively (at least for bony fish) consists of eight genes: four linked in a single cluster, three in another, and one isolated gene. This is inferred from the genomic organization of these genes in human, chicken, zebrafish and coelacanth. All of these genes are present in the coelacanth, but differential loss of some of them in the human, chicken and zebrafish lineages results in incomplete clusters, with the human mir-454 gene associated with the a cluster and the chicken mir-454 gene associated with the b cluster. Coelacanth genes are named according to the nomenclature proposed herein (see Fig. 4, Supp. Fig. 3, and Supp. File 3, as well as the text). B. A mid-point rooted phylogenetic (distance) analysis supporting the scenario outlined in A. Because of the assumed homology between the two mir-454 genes, the genes of the chicken b cluster are given the same names as the genes found in the human a cluster, but these are all paralogues of one another, not orthologues, and thus the human mir-454 gene is paralogous, not orthologous, to the chicken mir-454 gene. C. Alignment of the mir-130 mature sequences showing the identity of the seed region between the mir-130, mir-301 and mir-454 sequences, in addition to other regions of similarity among the three sets of genes.
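The seed comparison in panel C amounts to checking that one window of the mature sequence is identical across family members. The sketch below assumes the common convention that the seed spans nucleotides 2 to 8 of the mature sequence; the caption does not state the window, so `start` and `end` are assumptions, and the sequences are hypothetical placeholders rather than the mir-130, mir-301 and mir-454 reads.

```python
from typing import Mapping


def seed(mature: str, start: int = 2, end: int = 8) -> str:
    """Extract the seed using 1-based, inclusive coordinates (assumed window 2-8)."""
    return mature[start - 1:end]


def share_seed(sequences: Mapping[str, str]) -> bool:
    """True when every mature sequence has the same seed."""
    return len({seed(s) for s in sequences.values()}) == 1


# Hypothetical toy mature sequences, labelled with made-up names.
family = {
    "toy-a": "GACGUCAUGG",
    "toy-b": "GACGUCAUCC",
    "toy-c": "UACGUCAUAA",
}

print({name: seed(s) for name, s in family.items()})
print(share_seed(family))
```

The toy sequences differ at position 1 and downstream of the seed, yet still share a seed, which mirrors the pattern the caption describes for the three sets of genes.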

Supplemental Figure 3. The nomenclature proposal and methods employed within the manuscript. A. The proposed nomenclature system for the mir-204 family, highlighting the advantages of the new system and the disadvantages of the old. Shown on the left is a phylogenetic tree of all of the mir-204 relatives in 21 osteichthyan taxa. Genes already deposited in miRBase are named according to the miRBase entry (e.g., hsa-mir-204); those that are not yet released are given the name proposed herein (e.g., Mdo-Mir-204-P1). Note that there are three clear Mir-204 paralogues, here named P1, P2 and P3. Mir-204-P1 is present in all 21 taxa, and in all taxa with a sequenced genome this gene is located in an intron of the TRPM3 gene (or associated with the gene where exon/intron boundaries are not defined, as in the turtle Pelodiscus sinensis) (orange). The Mir-204-P2 gene is also present in all 21 taxa, and is located in (or associated with) an intron in the TRPM1 gene (purple). Note, though, that there are two copies of this gene in zebrafish, and both are associated with paralogues of the trpm1 gene (top); one of these genes is novel in that no miRBase entry is recorded for it, but an identifier is present in Ensembl. The third Mir-204 paralogue, Mir-204-P3, is absent in mammals, zebrafish, and Xenopus, but present in all of the other taxa, and in both birds and turtles it is linked to the ABHD17A (green) and ATB8B3 (magenta) genes. Thus, both the phylogenetic and the syntenic analyses support the proposed nomenclature. Importantly, the eutherian mir-211 sequence belongs to the Mir-204-P2 subgroup (which includes the only Mir-204 gene in the platypus [oan-mir-204] and opossum [mdo-mir-204] currently deposited in miRBase), whereas the chicken mir-211 sequence belongs to the Mir-204-P3 subgroup. Furthermore, given that the platypus and opossum mir-204 sequences are members of the P2 subclade, there are likely genes in one or both taxa related to the P1 subgroup, and indeed both of these genes are present in their respective genomes, but are not yet deposited in miRBase.
Species abbreviations here and in all other figures and tables are as follows: Aca - Anolis carolinensis (lizard); Ami - Alligator mississippiensis (American alligator); Asi - Alligator sinensis (Chinese alligator); Asp - Apalone spinifera (spiny softshell turtle); Cli - Columba livia (pigeon); Cmy - Chelonia mydas (sea turtle); Cpi - Chrysemys picta (painted turtle); Dre - Danio rerio (zebrafish); Gga - Gallus gallus (chicken); Hsa - Homo sapiens (human); Lch - Latimeria chalumnae (coelacanth); Meu - Macropus eugenii (Tammar wallaby); Mdo - Monodelphis domestica (opossum); Mml - Macaca mulatta (rhesus macaque); Mmu - Mus musculus (mouse); Oan - Ornithorhynchus anatinus (platypus); Pbi - Python bivittatus (snake); Psi - Pelodiscus sinensis (Chinese softshell turtle); Rno - Rattus norvegicus (rat); Tgu - Taeniopygia guttata (zebrafinch); Xtr - Xenopus tropicalis (frog).
B. The procedure used to tabulate the nucleotide substitutions in pre-miRNA sequences. Shown are the alignments (using ClustalW, MacVector v. 10.0.2) of both the mature (red) and star (blue) reads from all Mir-204 genes from all 21 considered taxa. The numbers in parentheses to the right of the mature read indicate the fold difference between mature and star (from miRBase v. 21 and ref. 39). The dashes indicate identity between the species and human (or chicken for paralogue 3), whereas nucleotide substitutions relative to each of the human (or chicken) paralogues are shown as distinct nucleotides.
C. A phylogenetic reconstruction of each of these nucleotide substitutions in the context of a known phylogenetic tree (39). Mutations to the mature are shown in red, whereas mutations to the star are shown in blue. The paralogue number and nucleotide position of each mutation for both the mature and star were calculated using MacClade (v. 4.08). For changes at the root of the tree, paralogue and/or outgroup comparisons were used. For example, in Mir-204-P1, there is a change in the star at position 10 such that zebrafish (Dre) possesses a T whereas all of the other taxa possess a G. Because the G is present in this position in the other two paralogues (P2 and P3), the G is reconstructed as the primitive nucleotide for this position rather than the T.
D. These mutations are then plotted by position as a frequency histogram for both the mature (red) and star (blue) sequences. This was done for each of the 234 genes reconstructed as present in the last common ancestor of tetrapods (see Supp. File 5) and the results are presented in Fig. 6.
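The outgroup comparison and the per-position tally described above can be illustrated in code. This is only a sketch under stated assumptions: the authors used MacClade on a full tree, whereas the hypothetical helpers `primitive_state` and `position_histogram` below handle a single position and a flat list of mutations. The Mir-204-P1 star position 10 states follow the caption (only three of the 21 taxa are shown, as a reduced example); the mutation list is invented for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def primitive_state(ingroup: Iterable[str], outgroup: Iterable[str]) -> Optional[str]:
    """Return the ingroup state also seen in the outgroup (here, the other
    paralogues), or None when the comparison is ambiguous."""
    shared = set(ingroup) & set(outgroup)
    return next(iter(shared)) if len(shared) == 1 else None


def position_histogram(mutations: Iterable[Tuple[str, int]]) -> Counter:
    """Count reconstructed mutations per nucleotide position, pooled over genes."""
    return Counter(position for _gene, position in mutations)


# Star position 10 of Mir-204-P1: zebrafish has T, the other taxa have G.
p1_states = {"Dre": "T", "Hsa": "G", "Gga": "G"}
other_paralogues = ["G", "G"]  # P2 and P3 both carry G at this position
print(primitive_state(p1_states.values(), other_paralogues))

# Hypothetical list of (gene, position) mutations, not data from the paper.
toy_mutations = [("Mir-204-P1", 10), ("Mir-204-P2", 10), ("Mir-204-P2", 3)]
print(sorted(position_histogram(toy_mutations).items()))
```

Returning `None` for ambiguous cases keeps the sketch from guessing a polarity when the other paralogues disagree or share no state with the ingroup.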

Supplemental Figure 4. Nucleotide profiles of mature and star sequences of 234 genes present in the last common ancestor of tetrapods separated into 5p versus 3p arms. Note that although there are a few interesting differences between the two arms, the major biases shown in Figure 6 are seen in both the 5p and 3p arms and hence these biases are arm independent.