doi:10.1038/nature12414

WWW.NATURE.COM/NATURE

Supplementary Figure 1

[Panel a, flow chart: Illumina 521M reads, single-end/76 bp, and Illumina 354M reads, paired-end/ bp; raw read processing; Trinity assembly; primary assembly: 189,820 transcripts; remove transcripts without ORF, domain, or blast hit; dddlac: 49,394 transcripts (Max: 29,521 bp; N50: 1,570 bp).]
[Panel b, plot: relative frequency versus transcript length (bp) for the primary assembly and dddlac.]
[Panel c, Venn diagram of the categories domain, blast hit, and ORF > 75 AA; region counts: 4798, 3312, 2587, 17135, 7466, 1151, 12945.]

De novo assembly of the Dlac transcriptome. a: Flow chart of the assembly process; see Methods for details. b: Fractional length distribution of transcripts in the primary assembly (red) and the cDNA-enriched dddlac subset (turquoise). c: Venn diagram representation of dddlac annotations. Numbers refer to the number of transcripts in each of the listed categories.
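The N50 value quoted for dddlac in panel a is a standard assembly summary statistic: the length L such that transcripts of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths (in bp)."""
    if not lengths:
        raise ValueError("no transcript lengths given")
    ordered = sorted(lengths, reverse=True)   # longest transcript first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first length that reaches 50 %
            return length


# Hypothetical toy lengths, not data from the dddlac assembly.
toy_lengths = [100, 200, 300, 400, 500, 600]
print("max:", max(toy_lengths), "N50:", n50(toy_lengths))
```

Applied to the lengths of all 49,394 dddlac transcripts, the same function would be expected to return the N50 reported in the flow chart, provided the same definition was used.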
RESEARCH

Supplementary Figure 2

[Boxplot: maximal query coverage (x-axis, 0.25 to 1.00) for dddlac, Oryctolagus cuniculus and Mus musculus.]

Quality analysis of the dddlac assembly. The set of 248 core eukaryotic genes (ref. 41) was blasted against dddlac (tblastn with an e-value cutoff of 0.0001), the hit with the longest HSP was selected, and its coverage of the query length was calculated (1 = HSP extending over the entire query length). The Mus musculus (mouse; Ensembl v71) and Oryctolagus cuniculus (rabbit; Ensembl v71) transcriptomes were analyzed in the same way as assembly references. The distribution of the 248 maximal coverage scores is graphed for each species as a boxplot, with the lower and upper box boundaries corresponding to the first and third quartiles and the thick line indicating the median. Black circles mark the outliers of the fourth quartile. The mean maximal core gene coverage for dddlac is 96% (s.d. = 0.08) and thus within the range of the two vertebrate transcriptomes assembled with genomic support.
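The coverage score described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: each HSP is assumed to be available as a (query id, query start, query end) tuple with 1-based inclusive coordinates, queries without any hit are assigned a coverage of 0, and the function name and toy values are hypothetical.

```python
from statistics import mean, stdev


def max_query_coverage(hsps, query_lengths):
    """Coverage of each query by its longest HSP (1.0 = entire query covered).

    hsps: iterable of (query_id, query_start, query_end), 1-based inclusive.
    query_lengths: dict mapping query_id to query length in residues.
    """
    longest = {query_id: 0 for query_id in query_lengths}  # no hit -> 0
    for query_id, start, end in hsps:
        span = abs(end - start) + 1                        # HSP length on the query
        longest[query_id] = max(longest[query_id], span)
    return {q: longest[q] / query_lengths[q] for q in query_lengths}


# Hypothetical toy input, not the 248 core genes.
toy_lengths = {"geneA": 100, "geneB": 200}
toy_hsps = [("geneA", 1, 100), ("geneA", 10, 50), ("geneB", 1, 150)]
coverage = max_query_coverage(toy_hsps, toy_lengths)
print(coverage)
print("mean:", mean(coverage.values()), "sd:", stdev(coverage.values()))
```

The mean and standard deviation of the resulting scores correspond to the summary values quoted in the caption; the boxplot shows the same scores as quartiles.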
Supplementary Figure 3

[Panel a: neighbor joining tree of Wnt proteins from Hsap, Drer, Dmel, Cele, Smed and Dlac, with bootstrap values at the nodes; scale bar: 0.1. Planarian leaves: Smed-Wnt1, Smed-Wnt2, Smed-Wnt5, Smed-Wnt11-1, Smed-Wnt11-2, Smed-Wnt11-3, Smed-Wnt11-4, Smed-Wnt11-5, Smed-Wnt11-6; Dlac-Wnt1, Dlac-Wnt2, Dlac-Wnt5, Dlac-Wnt11-1, Dlac-Wnt11-2, Dlac-Wnt11-4, Dlac-Wnt11-5, Dlac-Wnt11-6. Dlac-Wnt11-3* is listed separately.]
[Panel b: neighbor joining tree of Fzd proteins from Hsap, Drer, Dmel, Cele, Smed and Dlac, with bootstrap values at the nodes; scale bar: 0.2. Planarian leaves: Smed-Fzd-1/2/7, Smed-Fzd-5/8-1, Smed-Fzd-5/8-2, Smed-Fzd-5/8-3, Smed-Fzd-5/8-4, Smed-Fzd-4-1, Smed-Fzd-4-2, Smed-Fzd-4-3, Smed-Fzd-4-4; Dlac-Fzd-1/2/7, Dlac-Fzd-5/8-1, Dlac-Fzd-5/8-2, Dlac-Fzd-5/8-3, Dlac-Fzd-5/8-4, Dlac-Fzd-4-1a, Dlac-Fzd-4-1b, Dlac-Fzd-4-2, Dlac-Fzd-4-3a, Dlac-Fzd-4-3b, Dlac-Fzd-4-4.]
[Panel c: sequence alignments; see the caption below.]

Analysis of Wnt ligands and Frizzled receptors (Fzd) in Dlac and Smed. a: Neighbor joining phylogenetic tree for the Wnt proteins; Smed Wnts are colored in red and Dlac Wnts in blue. Bootstrap values above 80% are shown. *: Dlac-Wnt11-3 is represented in the dddlac assembly only as a short fragment and was therefore not included in the alignments. b: Neighbor joining phylogenetic tree for the Fzd proteins; Smed Fzds are colored in red and Dlac Fzds in blue. Bootstrap values above 80% are shown.

In addition to the Dlac and Smed sequences, the Wnt and Frizzled protein sequences from Homo sapiens (Hsap), Danio rerio (Drer), Drosophila melanogaster (Dmel) and Caenorhabditis elegans (Cele) were retrieved based on their sequence similarity using BLAST. Domains were predicted using InterProScan and subsequently aligned using ClustalX 2.1. Neighbor joining phylogenetic trees were constructed using ClustalX, excluding aligned positions with gaps and correcting for multiple substitutions. The neighbor joining method was chosen primarily for its speed and comparative simplicity; the phylogenies are intended only to provide an outline of the Fzd and Wnt phylogenetic groupings and to indicate orthology between Smed and Dlac proteins.

Dlac Wnts were named following ref. 2. Planarian Fzds have not been named or analyzed systematically so far. As for the Wnts (ref. 2), stringent orthology assignments were difficult, but planarian Fzd proteins clearly partition into three distinct groups. Our naming scheme therefore designates the group by homology to the closest Fzd subfamilies (e.g., Fzd-4; Fzd-5/8; Fzd-1/2/7), followed by a digit designating the specific group member (e.g., Fzd-4-1 for Fzd-4). We propose the use of a letter postfix to designate putative species-specific paralogs in planarians (e.g., Dlac-Fzd-4-1a and Dlac-Fzd-4-1b), a case not yet considered by the planarian gene naming convention (ref. 3).

c: Top: Protein sequence alignment of the Fzd domains of Smed-Fzd-4-1, Dlac-Fzd-4-1a and Dlac-Fzd-4-1b using ClustalX. Bottom: Corresponding nucleotide alignment of the two Dlac Fzd domains. The high overall sequence homology between Dlac-Fzd-4-1a and Dlac-Fzd-4-1b, combined with frequent substitutions at the amino acid level and especially at the nucleotide level, favors a gene duplication event rather than an assembly artifact as the explanation for the existence of two Dlac-fzd-4-1 transcripts.
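The two tree-building options mentioned above, excluding aligned positions with gaps and correcting for multiple substitutions, act on the pairwise distances that feed the neighbor joining step. The sketch below illustrates both on a toy protein alignment. It uses Kimura's empirical correction, d = -ln(1 - p - p²/5), which is the correction described for proteins in the Clustal documentation; treating it as the exact formula applied here is an assumption, and the function names and sequences are hypothetical.

```python
import math


def gap_free_columns(alignment, gap="-"):
    """Indices of alignment columns in which no sequence has a gap."""
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    return [i for i in range(width) if all(seq[i] != gap for seq in alignment)]


def corrected_distance(seq_a, seq_b, columns):
    """Return (p, d): observed fraction of differing sites and corrected distance."""
    if not columns:
        raise ValueError("no gap-free columns to compare")
    p = sum(seq_a[i] != seq_b[i] for i in columns) / len(columns)
    inner = 1 - p - p * p / 5
    if inner <= 0:
        raise ValueError("sequences too divergent for this correction")
    return p, -math.log(inner)   # Kimura's empirical correction (assumed)


# Hypothetical toy alignment, not the Fzd domain sequences.
toy_alignment = ["MKT-LV", "MKSALV", "MRTALI"]
columns = gap_free_columns(toy_alignment)
p, d = corrected_distance(toy_alignment[0], toy_alignment[1], columns)
print(columns, p, round(d, 4))
```

The same observed fraction p, computed separately on the protein and nucleotide alignments of panel c, is one way to quantify the substitutions between Dlac-Fzd-4-1a and Dlac-Fzd-4-1b that the caption refers to.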
Supplementary Figure 4

[Image panels: 7 d, 15 d, 22 d, 30 d, 42 d, 56 d, 76 d, 85 d.]

The posterior head regeneration defect in Dlac is not a delay. Long-term observation of a single tail piece out of a cohort of 4, photographed at the indicated times post amputation. Scale bar: 500 µm.

Supplementary Figure 5

[Image panels: uncut animal; Trunk and Tail pieces at 0 d and 5 d; asterisks mark the anterior sucker.]

Dlac tail pieces regenerate the body edge. Expression of the edge marker Dlac-laminB in uncut Dlac (top) and in trunk or tail pieces at the indicated time points (n = 3). Asterisk: anterior sucker. Scale bars: 200 µm.
Supplementary Figure 6

[Plots: fold change (y-axis, 0 to 45) versus time post amputation (hr; 0, 4, 12, 16, 24, 48, 72, 120) for Smed Trunk (left) and Smed Tail (right). Genes: Smed-ChAT, Smed-dach, Smed-dlx, Smed-eya, Smed-FoxD, Smed-fzd-5/8-4, Smed-fzd-5/8-2, Smed-ndk, Smed-ndl-4, Smed-ndl-6, Smed-notum, Smed-opsin, Smed-otxB, Smed-Pax6A, Smed-Pax6B, Smed-prep, Smed-sFRP-1, Smed-wnt1, Smed-wnt2.]

Head marker upregulation at trunk and tail wounds in Smed. RNAseq time course of the indicated Smed head marker genes in trunk (left) and tail wounds (right). Expression levels are graphed as fold-change relative to the expression level at t0, at the indicated time points post amputation. Color-coding as in Fig. 3a. RNAseq data were not available for the trunk time points 72 h and 120 h. Asterisk: fold-change values exceeding the y-axis limits.
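The fold-change normalization described in the caption can be sketched as below: each time point is divided by the expression level at 0 h, and values above the plotted y-axis limit are flagged, as the asterisk does in the figure. This is a minimal illustration, assuming one normalized expression value per time point; the function name, the `y_max` parameter and the toy values are hypothetical.

```python
def fold_changes(expression_by_hour, y_max=None):
    """Fold-change relative to the 0 h value for each time point.

    expression_by_hour: dict mapping hours post amputation to expression level.
    y_max: optional axis limit; values above it are flagged as off scale.
    Returns a list of (hour, fold_change, off_scale) tuples sorted by hour.
    """
    if 0 not in expression_by_hour:
        raise ValueError("a 0 h reference value is required")
    baseline = expression_by_hour[0]
    if baseline <= 0:
        raise ValueError("the 0 h expression level must be positive")
    result = []
    for hour in sorted(expression_by_hour):
        fold = expression_by_hour[hour] / baseline
        off_scale = y_max is not None and fold > y_max
        result.append((hour, fold, off_scale))
    return result


# Hypothetical toy values, not RNAseq measurements from the study.
toy_series = {0: 2.0, 4: 4.0, 12: 10.0, 24: 100.0}
for hour, fold, off_scale in fold_changes(toy_series, y_max=45):
    print(hour, fold, "*" if off_scale else "")
```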
Supplementary Figure 7

[Image panels: Dlac-ndk in situ staining of Trunk and Tail wounds at 0 h, 24 h, 48 h and 72 h.]

In situ verification of Dlac-ndk expression kinetics. Representative trunk (left) and tail wounds (right) are shown at the indicated time points (n = 3 per time point). Compare to the Dlac-ndk RNAseq trace in Fig. 3a (marked by an asterisk). Scale bars: 200 µm.
Supplementary Figure 8

[Plot: fold change (y-axis, 0.0 to 2.5) versus time post amputation (day; 0, 2, 5), with the RNAseq trace overlaid.]

qPCR verification of the Dlac-wnt11-5 expression time course at tail wounds. Dlac-wnt11-5 levels were individually quantified in RNA samples from 8 tail pieces each at 0 h, 48 h and 120 h post amputation. Individual measurements (black circles, averages of three technical replicates) are plotted as fold-change relative to the mean expression level of the 0 h samples. Error bars designate 1 standard deviation of the mean. The RNAseq quantification of Dlac-wnt11-5 fold-change expression relative to 0 h (red line) remains within the range of the qPCR quantification.
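The summary steps named in the caption (averaging technical replicates, expressing each sample as fold-change relative to the mean of the 0 h samples, and reporting a standard deviation) can be sketched as follows. The caption does not state how relative expression was derived from the raw qPCR signal, so the input here is assumed to be already in linear relative-expression units; the sample standard deviation is also an assumption, and all names and values are hypothetical.

```python
from statistics import mean, stdev


def summarize_qpcr(samples, reference_time=0):
    """Summarize qPCR measurements per time point.

    samples: dict mapping time (h) to a list of samples, each sample being a
             list of technical replicate values (linear relative expression).
    Returns a dict mapping time to (fold_changes, mean_fold, sd_fold).
    """
    # One value per biological sample: the average of its technical replicates.
    per_sample = {t: [mean(reps) for reps in pieces] for t, pieces in samples.items()}
    reference_mean = mean(per_sample[reference_time])
    summary = {}
    for t, values in per_sample.items():
        folds = [v / reference_mean for v in values]   # relative to mean of 0 h
        summary[t] = (folds, mean(folds), stdev(folds))
    return summary


# Hypothetical toy data: two samples per time point, three replicates each.
toy = {
    0: [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]],
    48: [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0]],
    120: [[2.0, 2.0, 2.0], [6.0, 6.0, 6.0]],
}
for t, (folds, avg, sd) in sorted(summarize_qpcr(toy).items()):
    print(t, folds, avg, round(sd, 3))
```

An RNAseq fold-change value for the same time point can then be checked against the spread of the individual qPCR fold-changes, which is the comparison the caption describes.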
Supplementary Figure 9

[Image panels with phenotype counts: 5/6, 4/5, 6/6.]

Role of Dlac Wnt components in tail patterning. Triple Dlac-wnt11-1,-2,-5(RNAi) cannot rescue head regeneration on tail pieces, but prevents general tail regeneration. Representative head, trunk and tail fragments at 16 dpa of RNAi-injected animals (see Methods). The number of fragments displaying the shown phenotype/total number of fragments per category is indicated. For controls, see Fig. 1a. Scale bars: 500 µm.
Supplementary Figure 10

[Image panels: Dlac-APC(RNAi) tail fragment, 5/9; Dlac-wnt11-2 staining, 5/5.]

Dlac-APC(RNAi) transforms tail piece blastemas into tails. Top: Representative tail fragment at 30 dpa from an RNAi-injected animal (see Methods). For controls, see Fig. 1a. Bottom: Expression of the tail marker Dlac-wnt11-2 in a 30 dpa tail fragment, confirming the tail identity of the triangular outgrowth (notice the tip staining). For control, see Fig. 1d. The number of fragments displaying the triangular outgrowth or anterior wnt11-2 expression/total number of fragments is indicated. Scale bars: 500 µm.
Supplementary Figure 11

[Bar chart: relative mRNA expression of Dlac-beta-Cat-1(RNAi) / Ctrl (y-axis, 0.00 to 1.00) for the Dlac-beta-Cat-1 level and the Dlac-beta-Cat-2 level.]

Efficient and specific reduction of Dlac-beta-Catenin-1 RNA levels by RNAi. qPCR quantification of Dlac-beta-Catenin-1 mRNA and Dlac-beta-Catenin-2 mRNA (specificity control) in total RNA samples from two Dlac-beta-Catenin-1(RNAi)-injected animals, isolated 3 days after the last injection (see Methods). Measurements were normalized against equivalent RNA samples from control animals. Error bars: standard deviation of the mean between the two biological replicates.
Supplementary Figure 12

[Image panels: Dlac-beta-Catenin-1(RNAi) fragments with phenotype counts 22/24 and 20/24.]

Tail-to-head conversion in Dlac-beta-Catenin-1(RNAi) animals. Head and trunk fragments of a Dlac-beta-Catenin-1(RNAi) animal (see Methods) at 21 dpa. The number of fragments displaying the shown phenotype/total number of fragments per category is indicated. Scale bars: 500 µm.
Supplementary Figure 13

[Image panels: wild type tail fragment, 5/25; Dlac-wnt11-2 staining, 5/5.]

Ectopic tail regeneration in a wild type Dlac tail piece. Top: Tail fragment at 30 dpa from wild type Dlac. For control, see Fig. 1a. Bottom: Expression of the tail marker Dlac-wnt11-2 in a 30 dpa tail fragment, confirming that the triangular outgrowth is a tail (notice the tip staining). For control, see Fig. 1d. The number of fragments displaying the triangular outgrowth or anterior wnt11-2 expression/total number of fragments is indicated. Scale bars: 500 µm.

Double tails were observed infrequently, but consistently, in a small proportion of animals. The phenomenon has been noticed before and was estimated to occur in 4 out of 31 cases (ref. 43), even though tail identity could not be ascertained in these studies for lack of appropriate markers. Interestingly, blastema extracts have been reported to increase the proportion of tail fragments developing outgrowths (refs 44, 45), and the morphology of the examples shown indicates that they were most likely tails.
Supplementary Figure 14

[Main plot: fold change (y-axis, 1 to 40) versus time post amputation (hr; 0, 4, 12, 16, 24, 48, 72, 120). Legend: Species (Smed, Dlac); Wound (Trunk, Tail); Gene (wnt1, notum). Inset plots: fold change (y-axis, 1 to 10) versus time post amputation (hr; 0, 4, 16, 48) for Dlac-wnt1 Trunk and Dlac-wnt1 Tail, with qPCR and RNAseq traces.]

Greatly reduced wnt1/notum early wounding response in Dlac. RNAseq time course of wnt1 and notum expression at Dlac or Smed trunk and tail wounds. Expression levels are graphed as fold-change relative to the expression level at t0, at the indicated time points post amputation. RNAseq data were not available for the Smed trunk time points 72 h and 120 h. Inset: qPCR verification of the Dlac-wnt1 expression time course at trunk and tail wounds. Dlac-wnt1 levels were quantified in RNA samples from 5 pooled regenerating trunk wounds or regeneration-impaired tail wounds at 0, 4 h, 16 h, 120 h post amputation. qPCR measurements (black) were plotted as fold-change relative to the expression level at t0, superimposed on the respective RNAseq trace (turquoise) from the main figure.
References

41. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
42. Reddien, P. W., Newmark, P. A. & Sánchez Alvarado, A. Gene nomenclature guidelines for the planarian Schmidtea mediterranea. Dev Dyn 237, 3099–3101 (2008).
43. Bautz, A. Possibilités de régénération antérieure chez des fragments postpharyngiens de Dendrocoelum lacteum. Bulletin de la Société Zoologique de France 103, 403 (1978).
44. Sauzin-Monnot, M.-J. Action de broyats de blastèmes de régénération sur l'activité synthéthique de fragments postérieurs de Planaires Dendrocoelum leacteum, sectionnées en arrière du pharynx. Comptes rendus des seances de l'academie des Sciences / D 282, 1885–1888 (1976).
45. Sauzin-Monnot, M.-J. Effets d'homogénats fractionnés de blastèmes de régénération sur des fragments postérieurs de Dendrocoelum leacteum. Rôle possible des sécrétions nerveuses. Comptes rendus des seances de l'academie des Sciences / D 290, 351–354 (1980).