SUPPLEMENTARY INFORMATION

doi:10.1038/nature12414
WWW.NATURE.COM/NATURE

Supplementary Figure 1: De novo assembly of the Dlac transcriptome.

[Panel a, flow chart: Illumina 521M reads, single-end/76 bp, and Illumina 354M reads, paired-end/ bp; raw read processing; Trinity assembly; primary assembly: 189,820 transcripts; remove transcripts without ORF, domain, or blast hit; dddlac: 49,394 transcripts (Max: 29,521 bp; N50: 1,570 bp). Panel b axes: Frequency (relative) against transcript length (bp). Panel c, Venn diagram categories: domain, blast hit, ORF > 75 AA; region counts: 4798, 3312, 2587, 17135, 7466, 1151, 12945.]

a: Flow chart of the assembly process; see Methods for details. b: Fractional length distribution of transcripts in the primary assembly (red) and the cDNA-enriched dddlac subset (turquoise). c: Venn diagram representation of dddlac annotations. Numbers refer to the number of transcripts in each of the listed categories.
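The panel a summary reports a maximum transcript length and an N50 for the dddlac subset. As a reminder of what the N50 statistic measures, here is a minimal sketch under the usual definition (the length at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembled bases). The function name `n50` and the toy lengths are illustrative assumptions, not part of the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (bp), not taken from the dddlac assembly.
toy_lengths = [100, 200, 300, 400, 500]
print("max:", max(toy_lengths), "N50:", n50(toy_lengths))
```

Applied to the real transcript lengths, the same two numbers would correspond to the "Max" and "N50" values quoted in the flow chart.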

Supplementary Figure 2: Quality analysis of the dddlac assembly.

[Boxplot rows: dddlac, Oryctolagus cuniculus, Mus musculus. Axis: Maximal Query Coverage (0.25 to 1.00).]

The set of 248 core eukaryotic genes (ref. 41) was blasted against dddlac (tblastn with e-value cutoff 0.0001), the hit with the longest HSP was selected, and the respective coverage in terms of query length was calculated (1 = HSP extending over the entire query length). The Mus musculus (mouse; Ensembl v71) and Oryctolagus cuniculus (rabbit; Ensembl v71) transcriptomes were similarly analyzed as assembly references. The distribution of the 248 maximal coverage scores is graphed for each species as a boxplot, with lower and upper box boundaries corresponding to the first and third quartiles and the thick line indicating the median. Black circles mark the outliers of the fourth quartile. The mean maximal core gene coverage for dddlac is 96% (s.d. = 0.08) and thus within range of the two vertebrate transcriptomes assembled with genomic support.
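The coverage score described in this legend can be sketched as follows. This is an illustration only: it assumes that HSP length is measured in query coordinates (1-based, inclusive) and that queries without any hit score 0, neither of which the legend states. The function name `max_query_coverage` and the toy data are hypothetical.

```python
from collections import defaultdict


def max_query_coverage(hsps, query_lengths):
    """Coverage of each query by its longest HSP.

    hsps: iterable of (query_id, query_start, query_end), 1-based inclusive.
    query_lengths: dict mapping query_id to query length in residues.
    Returns a dict mapping query_id to a coverage fraction (1.0 means the
    HSP spans the entire query). Queries with no HSP get 0.0 (assumption).
    """
    longest = defaultdict(int)
    for query_id, start, end in hsps:
        span = abs(end - start) + 1
        longest[query_id] = max(longest[query_id], span)
    return {
        query_id: longest.get(query_id, 0) / length
        for query_id, length in query_lengths.items()
    }


# Hypothetical toy input: two HSPs for g1, one for g2, none for g3.
toy_hsps = [("g1", 1, 50), ("g1", 10, 100), ("g2", 1, 200)]
toy_lengths = {"g1": 100, "g2": 200, "g3": 150}
print(max_query_coverage(toy_hsps, toy_lengths))
```

The 248 resulting scores per transcriptome are what the boxplot summarizes.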

Supplementary Figure 3: Analysis of Wnt ligands and Frizzled receptors (Fzd) in Dlac and Smed.

[Panel a: neighbor joining tree of Wnt proteins from Hsap, Drer, Dmel, Cele, Smed and Dlac. Planarian sequences shown: Smed-Wnt1, Smed-Wnt2, Smed-Wnt5, Smed-Wnt11-1 to Smed-Wnt11-6; Dlac-Wnt1, Dlac-Wnt2, Dlac-Wnt5, Dlac-Wnt11-1, Dlac-Wnt11-2, Dlac-Wnt11-3*, Dlac-Wnt11-4, Dlac-Wnt11-5, Dlac-Wnt11-6. Panel b: neighbor joining tree of Fzd proteins from the same species. Planarian sequences shown: Smed-Fzd-1/2/7, Smed-Fzd-5/8-1 to Smed-Fzd-5/8-4, Smed-Fzd-4-1 to Smed-Fzd-4-4; Dlac-Fzd-1/2/7, Dlac-Fzd-5/8-1 to Dlac-Fzd-5/8-4, Dlac-Fzd-4-1a, Dlac-Fzd-4-1b, Dlac-Fzd-4-2, Dlac-Fzd-4-3a, Dlac-Fzd-4-3b, Dlac-Fzd-4-4. Panel c: sequence alignments.]

a: Neighbor joining phylogenetic tree for the Wnt proteins. Smed Wnts are colored in red and Dlac Wnts in blue. Bootstrap values above 80% are shown. *: Dlac-Wnt11-3 is represented in the dddlac assembly as a short fragment and was therefore not included in the alignments.

b: Neighbor joining phylogenetic tree for the Fzd proteins. Smed Fzds are colored in red and Dlac Fzds in blue. Bootstrap values above 80% are shown.

The Wnt and Frizzled protein sequences from Homo sapiens (Hsap), Danio rerio (Drer), Drosophila melanogaster (Dmel) and Caenorhabditis elegans (Cele) were retrieved based on their sequence similarity using BLAST, in addition to the Dlac and Smed sequences. Domains were predicted using InterProScan and subsequently aligned using ClustalX 2.1. Neighbor joining phylogenetic trees were constructed using ClustalX, excluding aligned positions with gaps and correcting for multiple substitutions. The neighbor joining method was chosen primarily due to its speed and comparative simplicity; the phylogenies are intended only to provide an outline of the Fzd and Wnt phylogenetic groupings and to indicate orthology between Smed and Dlac proteins.

Dlac Wnts were named based on ref. 2. Planarian Fzds have not been named or analyzed systematically so far. As for the Wnts (ref. 2), stringent orthology assignments were difficult, but planarian Fzd proteins clearly partition into three distinct groups. Our naming scheme therefore designates the group by homology to the closest Fzd subfamilies (e.g., Fzd-4; Fzd-5/8; Fzd-1/2/7), followed by a digit designation of the specific group member (e.g., Fzd-4-1 for Fzd-4). We propose the use of a letter postfix to designate putative species-specific paralogs in planarians (e.g., Dlac-Fzd-4-1a and Dlac-Fzd-4-1b), a case not yet considered by the planarian gene naming convention (ref. 3).

c: Top: Protein sequence alignment of the Fzd domain of Smed-Fzd-4-1, Dlac-Fzd-4-1a and Dlac-Fzd-4-1b using ClustalX. Bottom: Corresponding nucleotide alignments of the two Dlac Fzd domains. High overall sequence homology between Dlac-Fzd-4-1a and Dlac-Fzd-4-1b, yet frequent substitutions at the amino acid level and especially the nucleotide level, favor a gene duplication event rather than an assembly artifact as the explanation for the existence of two Dlac-fzd-4-1 transcripts.
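The two tree-building options mentioned in this legend, excluding gapped positions and correcting for multiple substitutions, can be illustrated with a short sketch. It is not the ClustalX code. It assumes that the correction is Kimura's empirical formula for protein distances, d = -ln(1 - p - 0.2 p^2), which the legend does not name. All function names and the toy alignment are hypothetical.

```python
import math


def strip_gap_columns(aligned, gap="-"):
    """Drop every alignment column in which any sequence has a gap."""
    kept = [col for col in zip(*aligned) if gap not in col]
    return ["".join(col[i] for col in kept) for i in range(len(aligned))]


def p_distance(a, b):
    """Fraction of differing positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def kimura_protein_distance(p):
    """Kimura's empirical correction for multiple substitutions (assumed)."""
    inner = 1 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("p-distance too large for this correction")
    return -math.log(inner)


# Hypothetical toy alignment; the third column contains gaps and is removed.
toy = ["MK-LV", "MKALV", "MR-LI"]
ungapped = strip_gap_columns(toy)
p = p_distance(ungapped[0], ungapped[2])
print(ungapped, p, round(kimura_protein_distance(p), 4))
```

A matrix of such corrected pairwise distances is the input a neighbor joining implementation would cluster into a tree.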

Supplementary Figure 4: The posterior head regeneration defect in Dlac is not a delay.

[Panel labels: 7 d, 15 d, 22 d, 30 d, 42 d, 56 d, 76 d, 85 d.]

Long-term observation of a single tail piece out of a cohort of 4, photographed at the indicated times post amputation. Scale bar: 500 µm.

Supplementary Figure 5: Dlac tail pieces regenerate the body edge.

[Panel labels: Trunk, Tail; 0 d, 5 d.]

Expression of the edge marker Dlac-laminB in uncut Dlac (top) and trunk or tail pieces at the indicated time points (n = 3). Asterisk: anterior sucker. Scale bars: 200 µm.

Supplementary Figure 6: Head marker upregulation at trunk and tail wounds in Smed.

[Panels: Smed Trunk, Smed Tail. Axes: Fold Change against Time post amputation (hr; 0, 4, 12, 16, 24, 48, 72, 120). Genes: Smed-ChAT, Smed-dach, Smed-dlx, Smed-eya, Smed-FoxD, Smed-fzd-5/8-4, Smed-fzd-5/8-2, Smed-ndk, Smed-ndl-4, Smed-ndl-6, Smed-notum, Smed-opsin, Smed-otxB, Smed-Pax6A, Smed-Pax6B, Smed-prep, Smed-sFRP-1, Smed-wnt1, Smed-wnt2.]

RNAseq time course of the indicated Smed head marker gene expression in trunk (left) and tail wounds (right). Expression levels are graphed as fold-change relative to the expression level at t0, time points post amputation as indicated. Color-coding as in Fig. 3a. RNAseq data was not available for trunk time points 72 h and 120 h. Asterisk: fold-change values exceeding y-axis limits.

Supplementary Figure 7: In situ verification of Dlac-ndk expression kinetics.

[Panel labels: Trunk, Tail; Dlac-ndk; 0 h, 24 h, 48 h, 72 h.]

Representative trunk (left) and tail wounds (right) are shown at the indicated time points (n = 3/time point). Compare to the Dlac-ndk RNAseq trace in Fig. 3a (marked by an asterisk). Scale bars: 200 µm.

Supplementary Figure 8: qPCR verification of the Dlac-wnt11-5 expression time course at tail wounds.

[Axes: Fold Change against Time post amputation (day; 0, 2, 5). Legend: RNAseq.]

Dlac-wnt11-5 levels were individually quantified in RNA samples from 8 tail pieces each at 0, 48 h and 120 h post amputation. Individual measurements (black circles, averages of three technical replicates) are plotted as fold-change relative to the mean expression level of the 0 h samples. Error bars designate 1 standard deviation of the mean. The RNAseq quantification of Dlac-wnt11-5 fold-change expression relative to 0 h (red line) remains within the range of the qPCR quantification.
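The normalization described in this legend (technical replicates averaged per sample, then each sample expressed relative to the mean of the 0 h samples) can be sketched as below. The sketch assumes the inputs are already linear-scale relative expression values rather than raw Ct values, and that the error bars are sample standard deviations. The function names and the toy numbers are hypothetical.

```python
from statistics import mean, stdev


def fold_changes(samples_by_time, baseline=0):
    """Fold-change of each sample relative to the mean baseline level.

    samples_by_time: dict mapping a time point to a list of samples, where
    each sample is a list of technical replicate values (linear scale).
    Returns a dict mapping each time point to a list of per-sample
    fold-change values.
    """
    per_sample = {
        t: [mean(replicates) for replicates in samples]
        for t, samples in samples_by_time.items()
    }
    base = mean(per_sample[baseline])
    return {t: [value / base for value in values] for t, values in per_sample.items()}


def summarize(fold_change_by_time):
    """Mean and sample standard deviation of fold-change per time point."""
    return {t: (mean(v), stdev(v)) for t, v in fold_change_by_time.items()}


# Hypothetical toy data: two samples per time point, three replicates each.
toy = {
    0: [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]],
    48: [[4.0, 4.0, 4.0], [8.0, 8.0, 8.0]],
}
fc = fold_changes(toy)
print(fc)
print(summarize(fc))
```

Dividing by the mean of the 0 h samples, rather than by each sample's own 0 h value, matches the legend, since every tail piece is quantified at a single time point.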

Supplementary Figure 9: Role of Dlac Wnt components in tail patterning.

[Panel labels: 5/6, 4/5, 6/6.]

Triple Dlac-wnt11-1,-2,-5(RNAi) cannot rescue head regeneration on tail pieces, but prevents general tail regeneration. Representative head, trunk and tail fragments at 16 dpa of RNAi-injected animals (see Methods). The number of fragments displaying the shown phenotype/total number of fragments per category is indicated. For controls, see Fig. 1a. Scale bars: 500 µm.

Supplementary Figure 10: Dlac-APC(RNAi) transforms tail piece blastemas into tails.

[Panel labels: Dlac-APC(RNAi), 5/9; Dlac-wnt11-2, 5/5.]

Top: Representative tail fragment at 30 dpa from an RNAi-injected animal (see Methods). For controls, see Fig. 1a. Bottom: Expression of the tail marker Dlac-wnt11-2 in a 30 dpa tail fragment, confirming the tail identity of the triangular outgrowth (note the tip staining). For control, see Fig. 1d. The number of fragments displaying the triangular outgrowth or anterior wnt11-2 expression/total number of fragments is indicated. Scale bars: 500 µm.

Supplementary Figure 11: Efficient and specific reduction of Dlac-beta-Catenin-1 RNA levels by RNAi.

[Axes: Relative mRNA expression, Dlac-beta-Cat-1(RNAi)/Ctrl (0.00 to 1.00). Bars: Dlac-beta-Cat-1 level, Dlac-beta-Cat-2 level.]

qPCR quantification of Dlac-beta-Catenin-1 mRNA and Dlac-beta-Catenin-2 mRNA (specificity control) in total RNA samples from two Dlac-beta-Catenin-1(RNAi) injected animals, isolated 3 days after the last injection (see Methods). Measurements were normalized against equivalent RNA samples from control animals. Error bars: standard deviation of the mean between the two biological replicates.

Supplementary Figure 12: Tail-to-head conversion in Dlac-beta-Catenin-1(RNAi) animals.

[Panel labels: Dlac-beta-Catenin-1(RNAi); 22/24, 20/24.]

Head and trunk fragment of a Dlac-beta-Catenin-1(RNAi) animal (see Methods) at 21 dpa. The number of fragments displaying the shown phenotype/total number of fragments per category is indicated. Scale bars: 500 µm.

Supplementary Figure 13: Ectopic tail regeneration in a wild-type Dlac tail piece.

[Panel labels: 5/25; Dlac-wnt11-2, 5/5.]

Top: Tail fragment at 30 dpa from wild-type Dlac. For control, see Fig. 1a. Bottom: Expression of the tail marker Dlac-wnt11-2 in a 30 dpa tail fragment, confirming the triangular outgrowth as a tail (note the tip staining). For control, see Fig. 1d. The number of fragments displaying the triangular outgrowth or anterior wnt11-2 expression/total number of fragments is indicated. Scale bars: 500 µm.

Double tails were observed infrequently, but consistently, in a small proportion of animals. The phenomenon has been noticed before and estimated to occur at a frequency of 4 out of 31 cases (ref. 43), even though tail identity could not be ascertained in these studies due to the lack of appropriate markers. Interestingly, blastema extracts have been reported to increase the proportion of tail fragments developing outgrowths (refs 44, 45), and the morphology of the examples shown indicates that they were most likely tails.

Supplementary Figure 14: Greatly reduced wnt1/notum early wounding response in Dlac.

[Main panel axes: Fold Change against Time post amputation (hr; 0, 4, 12, 16, 24, 48, 72, 120). Legend: Species (Smed, Dlac); Wound (Trunk, Tail); Gene (wnt1, notum). Inset panels: Dlac-wnt1 Trunk, Dlac-wnt1 Tail; axes: Fold Change against Time post amputation (hr; 0, 4, 16, 48); legend: qPCR, RNAseq.]

RNAseq time course of wnt1 and notum expression at Dlac or Smed trunk and tail wounds. Expression levels are graphed as fold-change relative to the expression level at t0, time points post amputation as indicated. RNAseq data was not available for Smed trunk time points 72 h and 120 h. Inset: qPCR verification of the Dlac-wnt1 expression time course at trunk and tail wounds. Dlac-wnt1 levels were quantified in RNA samples from 5 pooled regenerating trunk wounds or regeneration-impaired tail wounds at 0, 4 h, 16 h, 120 h post amputation. qPCR measurements (black) were plotted as fold-change relative to the expression level at t0, superimposed on the respective RNAseq trace (turquoise) from the main figure.

References

41. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061-1067 (2007).
42. Reddien, P. W., Newmark, P. A. & Sánchez Alvarado, A. Gene nomenclature guidelines for the planarian Schmidtea mediterranea. Dev Dyn 237, 3099-3101 (2008).
43. Bautz, A. Possibilités de régénération antérieure chez des fragments postpharyngiens de Dendrocoelum lacteum. Bulletin de la Société Zoologique de France 103, 403 (1978).
44. Sauzin-Monnot, M.-J. Action de broyats de blastèmes de régénération sur l'activité synthéthique de fragments postérieurs de Planaires Dendrocoelum leacteum, sectionnées en arrière du pharynx. Comptes rendus des seances de l'academie des Sciences / D 282, 1885-1888 (1976).
45. Sauzin-Monnot, M.-J. Effets d'homogénats fractionnés de blastèmes de régénération sur des fragments postérieurs de Dendrocoelum leacteum. Rôle possible des sécrétions nerveuses. Comptes rendus des seances de l'academie des Sciences / D 290, 351-354 (1980).