Is retinoic acid genetic machinery a chordate innovation?


EVOLUTION & DEVELOPMENT 8:5, (2006); Vol. 8, No. 5, September–October 2006

Cristian Cañestro (a), John H. Postlethwait (a), Roser Gonzàlez-Duarte (b), and Ricard Albalat (b)
(a) Institute of Neuroscience, University of Oregon, Eugene, OR 97403, USA
(b) Departament de Genètica, Universitat de Barcelona, Av. Diagonal 645, Barcelona, Spain
Author for correspondence (ralbalat@ub.edu)

SUMMARY

Development of many chordate features depends on retinoic acid (RA). Because the action of RA during development seems to be restricted to chordates, it had been previously proposed that the invention of RA genetic machinery, including RA-binding nuclear hormone receptors (Rars), and the RA-synthesizing and RA-degrading enzymes Aldh1a (Raldh) and Cyp26, respectively, was an important step for the origin of developmental mechanisms leading to the chordate body plan. We tested this hypothesis by conducting an exhaustive survey of the RA machinery in genomic databases for twelve deuterostomes. We reconstructed the evolution of these genes in deuterostomes and showed for the first time that RA genetic machinery (that is, Aldh1a, Cyp26, and Rar orthologs) is present in nonchordate deuterostomes. This finding implies that RA genetic machinery was already present during early deuterostome evolution and, therefore, is not a chordate innovation. This new evolutionary viewpoint argues against the hypothesis that the acquisition of gene families underlying RA metabolism and signaling was a key event for the origin of chordates. We propose a new hypothesis in which lineage-specific duplication and loss of RA machinery genes could be related to the morphological radiation of deuterostomes.

INTRODUCTION

The origin of chordates and their innovative body plan remains controversial (Holland 2005b; Delsuc et al. 2006).
Because the action of retinoic acid (RA) in patterning embryonic axes seems to be restricted to chordates, it had been proposed that the morphogenetic role of RA was a chordate novelty linked to the origin of chordate-specific features (Shimeld 1996; Manzanares et al. 2000; Schilling and Knight 2001; Wada 2001; Holland 2005a). Vertebrates regulate RA action at two levels: metabolism and signaling (Fig. 1). Machinery governing RA metabolism includes the RA-synthesizing enzymes (retinaldehyde dehydrogenases Aldh1a, formerly Raldh) and the RA-degrading enzymes (Cyp26), which together regulate the spatio-temporal distribution of RA during embryogenesis (Niederreither et al. 2002; Reijntjes et al. 2005). Machinery for RA signaling includes the RA-binding nuclear hormone receptors (Rars), which mediate RA action on target genes (e.g., Hox genes) (Marshall et al. 1994). Because Aldh1a, Cyp26, and Rar had been described only in chordates, it had been proposed that the acquisition of these gene families was a key step for the innovation of the chordate body plan (reviewed in Fujiwara and Kawamura 2003). This hypothesis predicts that Aldh1a, Cyp26, and Rar genes should not be found outside the Chordata. To test this hypothesis, we searched for RA genetic machinery in Ambulacraria (echinoderms plus hemichordates), which appears to be the sister group of chordates (Cameron et al. 2000, but see Delsuc et al. 2006). To differentiate orthologs from paralogs, we investigated the phylogenetic relationships of deuterostome genes implicated in the evolution of the RA machinery and closely related families. The evolution of Aldh1a had remained obscure because of poorly supported gene phylogenies, confusion with the closely related Aldh2 (Fujiwara and Kawamura 2003), and the small number of taxa in which Aldh1a had been reported (only a few vertebrates and ascidians).
To illuminate Aldh1a evolution, we identified 73 Aldh1a-related genes in publicly available genome/EST databases of seven vertebrates, one cephalochordate, three urochordates, and two nonchordate deuterostomes. In this work we report the presence of the three main components of RA machinery, Aldh1a, Cyp26, and Rar, in nonchordate animals, showing for the first time that these genes were not a chordate innovation.

© 2006 The Author(s). Journal compilation © 2006 Blackwell Publishing Ltd.

Fig. 1. (A) Retinoic acid (RA) genetic machinery regulates RA action at two levels: metabolism and signaling. Aldh1a (red) and Cyp26 (blue) regulate the spatio-temporal distribution of RA. Heterodimers of Rar and Rxr (green and gray, respectively) mediate RA signaling to RA-target genes (e.g., Hox). In contrast to Rar, Rxr can heterodimerize with other nuclear receptors, and its presence in protostomes suggests a more ancient origin (reviewed in Escriva et al. 2000). (B) Gene phylogenies corroborate the orthology of nonchordate Aldh1a, Cyp26, and Rar proteins (colored). Consistent with the presence of Rar in sea urchin, we also found a sea urchin Rxr (in bold). Tree branch lengths correspond to neighbor-joining distances, and numbers are the bootstrap values supporting each node (n ; poorly supported nodes, <50%, were collapsed). The same tree topologies were supported by maximum-likelihood and maximum-parsimony methods. (C) The finding of Aldh1a, Cyp26, and Rar orthologs in Ambulacraria suggests a new evolutionary scenario, in which the RA genetic machinery was already present before the divergence of extant deuterostomes and, consequently, is not a chordate innovation. (D) The Aldh1a phylogeny illustrates taxon-specific variation of RA-related gene families caused by independent gene duplication and loss during deuterostome evolution. To understand the evolution of the Aldh1a subfamily, it was necessary to include in the analysis the next two most related Aldh families (Aldh2 and Aldh1l), and to consider exon–intron organization. The putative Aldh1a2 described in ascidians (Fujiwara and Kawamura 2003) has been renamed here as Aldh1a1/2/3a to reflect its phylogenetic affinities. The position of the urochordate Aldh1a1/2/3 cluster close to the Aldh1l family is probably distorted by a long-branch-attraction artifact.
Vertebrates: Hs, Homo sapiens; Mm, Mus musculus; Rn, Rattus norvegicus; Gg, Gallus gallus; Xt, Xenopus tropicalis; Dr, Danio rerio; and Tr, Takifugu rubripes. Cephalochordates: Bf, Branchiostoma floridae. Urochordates: (larvaceans) Od, Oikopleura dioica; (ascidians) Ci, Ciona intestinalis and Cs, Ciona savignyi. Hemichordates: (acorn worm) Sk, Saccoglossus kowalevskii. Echinoderms: (sea urchin) Sp, Strongylocentrotus purpuratus.

MATERIALS AND METHODS

Sequence analysis and identification of new genes

Sequences used in this work were assembled from data obtained by in silico screening of public databases (accession numbers and database URLs used in this work are provided in Table A1 in the appendix). Human reference proteins from each analyzed gene family were used as starting queries for BLAST searches (Altschul et al. 1997) against EST and genomic databases. The orthology of proteins was deduced initially by reciprocal best BLAST searches against the human genome GenBank database (Wall et al. 2003). From nonassembled genomes, Aldh genes were deduced by assembling cDNA and genomic contigs from 1009 trace sequences and 108 ESTs from NCBI. Gene structures and protein sequences were

deduced after merging the genomic sequences with ESTs when available, or by comparison with well-characterized Aldh, Cyp26, and Rar genes described in other species. Predicted genes from automatically annotated genomes were verified by eye, and errors in automatic annotations were corrected to maximize the similarity with ESTs, when available, and with other known enzymes. The partially predicted Strongylocentrotus purpuratus Rar protein sequence was completed by in silico genomic walking over 70 sequence traces. Zebrafish Aldh1a3 protein was initially predicted from putative exons inappropriately assembled by the Ensembl Zv5 database into three nonoverlapping genomic contigs (Table A1). To verify our zebrafish Aldh1a3 prediction, we amplified embryonic cDNA by PCR and cloned a cDNA containing the complete coding sequence.

Phylogenetic analysis

Protein sequence alignments were generated with ClustalX (Thompson et al. 1997) and corrected by eye. Only conserved parts of the proteins, whose alignments were unambiguous among paralogs, were considered for the phylogenetic analysis: from codon I40 to I513 of human ALDH1A2 for the Aldh alignment; from codon S74 to E156 and from I236 to S417 of human RARA for the nuclear receptor alignment; and from codon P45 to F490 of human CYP26A for the cytochrome P450 alignment. The MEGA package (Kumar et al. 2001) was used to construct maximum-parsimony trees and neighbor-joining trees, the latter with distances corrected by a Poisson distribution of amino acid substitutions. A thousand repetitions were run for bootstrap support. TREE-PUZZLE 5.2 (Schmidt et al. 2002) was used to construct maximum-likelihood phylogenetic trees, based on the quartet puzzling procedure and following the JTT model for amino acid substitutions.
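The Poisson correction mentioned above for the neighbor-joining distances can be illustrated with a minimal sketch. It assumes the standard form of the correction, d = -ln(1 - p), where p is the proportion of differing residues between two aligned sequences; the function names and the toy sequences are hypothetical, and this is not the MEGA implementation.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore any column where either sequence has a gap character.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when every column differs")
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical, for illustration only).
print(poisson_distance("AAAA", "AAAT"))
```

The correction matters most for divergent pairs: as p grows, multiple substitutions at the same site become likely, and the corrected distance rises faster than the raw proportion of differences.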
Subcellular localization prediction

The PSORT-II program was used to predict the subcellular localization of the deduced enzymes from their amino acid sequences (Nakai and Horton 1999). To recognize mitochondrial targeting signals, PSORT employs a discriminant analysis (called MITDISC) whose variables are the amino acid composition of the N-terminal 20 residues (Nakai and Kanehisa 1992). The prediction was performed using the k-nearest data points, and a probability was assigned to each subcellular localization.

RESULTS

Identification of Cyp26 and Rar sequences in nonchordate deuterostomes

Rar and Cyp26 sequences had previously been isolated only from several vertebrates and a few nonvertebrate species. To learn whether nonchordate deuterostomes have Cyp26 and Rar orthologs, we searched EST and genomic databases of the hemichordate Saccoglossus kowalevskii (acorn worm) and the echinoderm S. purpuratus (sea urchin). From this search, we identified Cyp26 sequences in both species and Rar in the S. purpuratus genome. Although no Rar ortholog could be identified in the S. kowalevskii EST database, without a complete genome sequence for this species we cannot rule out the presence of Rar in this hemichordate. Orthologies of the newly identified genes were strongly supported by sequence similarity (Tables A2 and A3) and by reciprocal BLAST against the human databases. In the case of the sea urchin Rar, the most significant BLAST hits were against the RARs (E-value of 9e-117), and the next hit was against THRB, with a substantially less significant E-value (4e-69). In the case of the acorn worm and sea urchin Cyp26 proteins, the most significant BLAST hits in the human genome were the human CYP26s (E-values of 2e-98 and 9e-72, respectively), and the next hits were against the human CYP51 and CYP3A, respectively, with much less significant E-values (2e-26 and 9e-21, respectively).
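The reciprocal best-hit reasoning used above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the hit tables are plain dictionaries, the sequence identifiers (SpRar, SpThr, HsRARA, HsTHRB) are hypothetical labels, and the reverse-search E-values are invented for the example. Only the forward E-values (9e-117 and 4e-69) come from the text.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, float]  # (subject identifier, E-value)


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the subject with the smallest (most significant) E-value."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(
    forward: Dict[str, List[Hit]],
    reverse: Dict[str, List[Hit]],
) -> Dict[str, str]:
    """Pair each query with its best hit only if the relation is mutual."""
    pairs: Dict[str, str] = {}
    for query, hits in forward.items():
        target = best_hit(hits)
        if target is None:
            continue
        # The candidate ortholog must point back to the original query.
        if best_hit(reverse.get(target, [])) == query:
            pairs[query] = target
    return pairs


# Forward search: sea urchin protein against human proteins
# (E-values from the text; identifiers are hypothetical labels).
forward = {"SpRar": [("HsRARA", 9e-117), ("HsTHRB", 4e-69)]}
# Reverse search: human protein against sea urchin proteins
# (hypothetical E-values, for illustration only).
reverse = {"HsRARA": [("SpRar", 1e-110), ("SpThr", 1e-60)]}

print(reciprocal_best_hits(forward, reverse))
```

A large gap between the best and the next-best E-value, as in the hits reported above, is what makes such an assignment unambiguous; when two families score almost equally, the method alone cannot decide.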
Gene phylogenies inferred by maximum likelihood and maximum parsimony (data not shown) showed the same tree topologies as the neighbor-joining tree (Fig. 1B) and thus corroborated the Rar and Cyp26 orthologies inferred by the reciprocal best hit method (Wall et al. 2003). The next most closely related families according to the BLAST searches (i.e., CYP51 for CYP26, and thyroid hormone receptor THR for Rar) and other closely related members (CYP4 and retinoid X receptor (RXR)) were used as outgroups in the phylogenetic analyses. In the larvacean urochordate Oikopleura dioica, despite the deep coverage of the genome database (9-fold coverage, Table A1), we did not find any clear Cyp26 or Rar orthologous genes. BLAST searches using Cyp26 and Rar proteins from multiple organisms as starting queries did identify phylogenetically more distant genes from the Cyp and nuclear hormone receptor families (e.g., Cyp4, Cyp5, Cyp2, Cyp3, and Thr, Rxr, Err, Ror; data not shown). This suggested that Cyp26 and Rar genes have either been lost or have diverged so much that they cannot be recognized by BLAST searches in the larvacean genome database.

Identification of Aldh1a and Aldh2 orthologous genes in deuterostomes

The high sequence similarity between the Aldh1a and Aldh2 families (Table A4) did not allow us to use the reciprocal BLAST approach to unambiguously ascribe the newly identified genes to one or the other of the two families. For this reason, we decided to identify all putative genes that might be orthologous to the Aldh1a or Aldh2 families in a large catalogue of deuterostomes.
From our Aldh1a–Aldh2 survey, we identified new Aldh1a and Aldh2 genes in vertebrates (Danio rerio (zebrafish), Takifugu rubripes (pufferfish), Xenopus tropicalis (frog), Gallus gallus (chicken)), cephalochordates (Branchiostoma floridae (amphioxus)), urochordates (the ascidians Ciona intestinalis and Ciona savignyi, and the larvacean Oikopleura dioica), hemichordates (S. kowalevskii (acorn worm)) and echinoderms (S. purpuratus

(purple sea urchin)) (Table A1). During our in silico screening, identification of orthologs of the next most closely related family (i.e., Aldh1l; Fig. 1D and Table A1) was considered as evidence that all Aldh1a and Aldh2 genes present in the databases had been retrieved. Except for the hemichordate S. kowalevskii, the deep coverage of the genomic databases screened (coverage is provided in Table A1) and the large number of EST sequences from the organisms analyzed suggested that all Aldh1a orthologs for a given species had likely been identified. To classify the new Aldh sequences, we took advantage of family-specific signatures in their exon–intron organizations: Aldh1a genes lack intron 4 but include an extra intron 12b, whereas the opposite (presence of intron 4 and absence of 12b) is characteristic of Aldh2 genes (Fig. 2). The recognition of these family-specific signatures, combined with information from phylogenetic analysis, sequence identity, genomic location, and prediction of subcellular localization, allowed us to classify the new Aldh proteins confidently into the different Aldh families. The fact that deuterostome Aldh1l genes share an overall exon–intron organization that differs from those of Aldh1a and Aldh2 genes (data not shown) suggests that the split between Aldh1l and the Aldh2/1a ancestral gene preceded the duplication and divergence of Aldh2 and Aldh1a. We now describe this complicated gene family in a taxonomic context.

Vertebrates

Aldh2

Aldh2 enzymes are a group of nuclear-encoded proteins that play a major role in acetaldehyde detoxification in mitochondria. Mitochondrial aldehyde dehydrogenases related to Aldh2 have been found in all eukaryote species so far investigated, suggesting an ancient origin probably preceding the evolution of eukaryotes (Rzhetsky et al. 1997; Yoshida et al. 1998; Perozich et al. 1999; Vasiliou et al. 1999; Sophos and Vasiliou 2003).
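The exon–intron signature described above (intron 4 absent and intron 12b present for Aldh1a; the opposite for Aldh2) amounts to a simple two-character rule, sketched here. The function name and the representation of a gene as a set of intron labels are assumptions made for illustration; in the study the signature was one line of evidence among several, not a standalone classifier.

```python
from typing import Set


def classify_aldh(introns: Set[str]) -> str:
    """Assign an Aldh family from the two diagnostic intron positions.

    `introns` holds the labels of the introns present in the gene,
    numbered against the reference structure (e.g. {"1", "2", "12b"}).
    """
    has_intron_4 = "4" in introns
    has_intron_12b = "12b" in introns
    if not has_intron_4 and has_intron_12b:
        return "Aldh1a"
    if has_intron_4 and not has_intron_12b:
        return "Aldh2"
    # Both present or both absent: the signature alone cannot decide,
    # so other evidence (phylogeny, identity, localization) is needed.
    return "undetermined"


# Hypothetical intron sets, for illustration only.
print(classify_aldh({"1", "2", "3", "5", "6", "12", "12b"}))  # Aldh1a
print(classify_aldh({"1", "2", "3", "4", "5", "6", "12"}))    # Aldh2
print(classify_aldh(set()))                                   # undetermined
```

Returning "undetermined" rather than forcing a call reflects the fact that lineage-specific intron loss or a fully reorganized gene structure erases the signal.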
Fig. 2. Schematic comparison of intron distribution in deuterostome Aldh1a and Aldh2 families. Arrowheads indicate intron positions. The 13 introns of the Strongylocentrotus purpuratus Aldh2 genes are shared by most Aldh2 and Aldh1a genes and, for this reason, this structure has been used as the reference (numbered 1–13, top; additional lineage-specific introns 7b, 9b, 10b, and 12b, bottom). White arrowheads indicate overall conserved intron positions; black arrowheads on a gray background denote introns 4 and 12b, which define Aldh family signatures: Aldh1a genes lack intron 4 but include an extra intron 12b, whereas the opposite (presence of intron 4 and absence of 12b) is characteristic of Aldh2 genes; gray arrowheads indicate lineage-specific introns. Notice the intron-less structure of the Aldh1b1 gene, and that O. dioica Aldh2 has completely reorganized its gene structure (not numbered). Homo sapiens genes represent the conserved vertebrate gene structure. Abbreviations are as in Fig. 1.

During our survey, we identified at least one clear Aldh2 gene in all vertebrates. All vertebrate Aldh2 genes shared the same exon–intron structure, made of 13

exons and 12 introns (Fig. 2), and coded for enzymes predicted to be located within mitochondria (Table 1). In the Zv5 zebrafish genome database we identified two aldh2 genes located contiguously in LG5 (Table A1) and transcribed in the same direction, probably resulting from a zebrafish-specific tandem duplication. The two predicted zebrafish Aldh2 proteins (Aldh2a and Aldh2b) are 95.2% identical. In addition to the typical Aldh2 genes, an extra human protein named ALDH1B1 (also known as ALDHx or ALDH5) (Hsu and Chang 1991) showed the highest similarity with ALDH2 in the human genome (Table A4). We identified Aldh1b1 genes in all mammals and amphibians examined, but not in birds and fishes (Table A1). Sequence similarity and phylogenetic analysis indicate that Aldh1b1 genes derived from a vertebrate ancestral duplication within the Aldh2 family (Fig. 1). We propose, therefore, that Aldh1b1 should be included in the Aldh2 family and renamed as Aldh2a2, and consequently the rest of the Aldh2 genes in vertebrates should be renamed as Aldh2a1.

Table 1. Prediction of subcellular localization of Aldh2, Aldh1b1, and Aldh1a enzymes
Columns: Enzyme; Mitochondrial; Cytoplasmic; Nuclear [numeric values not recoverable from this transcription]
ALDH2: SpAldh2a; SpAldh2b; SkAldh; OdAldh; CiAldh; CsAldh; BfAldh; HsALDH; HsALDH1B
ALDH1a: SkAldh1a1/2/; CiAldh1a1/2/3a; CiAldh1a1/2/3b; CiAldh1a1/2/3c; CiAldh1a1/2/3d; CsAldh1a1/2/3a; CsAldh1a1/2/3b/c; CsAldh1a1/2/3b/c; BfAldh1a1/2/3a; BfAldh1a1/2/3b; BfAldh1a1/2/3c; BfAldh1a1/2/3d; BfAldh1a1/2/3e; BfAldh1a1/2/3f; HsALDH1A; HsALDH1A; HsALDH1A
The PSORT-II program was used to predict the subcellular localization of the deduced Aldh enzymes. The prediction is performed using the k-nearest data points, and the probabilities (%) for the different subcellular localizations were calculated. Only mitochondrial, cytoplasmic, and nuclear are shown. The highest value is highlighted in bold. Human enzymes stand for the vertebrate enzymes. Nomenclature is as in Fig. 1.
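The criteria used above to infer a tandem duplication (two genes of the same family lying contiguously on the same linkage group and transcribed in the same direction) can be expressed as a small check over gene annotations. This is a sketch under stated assumptions: the `Gene` record, its field names, and all coordinates are hypothetical, and the input is assumed to list every annotated gene in the region so that adjacency in the sorted list means no gene lies in between.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    family: str


def tandem_pairs(genes: List[Gene]) -> List[Tuple[str, str]]:
    """Return adjacent same-family gene pairs on one chromosome and strand."""
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    pairs: List[Tuple[str, str]] = []
    for left, right in zip(ordered, ordered[1:]):
        same_place = left.chrom == right.chrom
        same_direction = left.strand == right.strand
        same_family = left.family == right.family
        if same_place and same_direction and same_family:
            pairs.append((left.name, right.name))
    return pairs


# Hypothetical coordinates, for illustration only.
region = [
    Gene("aldh2a", "LG5", 1_000, 9_000, "+", "Aldh2"),
    Gene("aldh2b", "LG5", 12_000, 20_000, "+", "Aldh2"),
    Gene("unrelated", "LG5", 30_000, 35_000, "-", "other"),
]
print(tandem_pairs(region))  # [('aldh2a', 'aldh2b')]
```

Requiring the same strand matches the "transcribed in the same direction" observation; dropping that condition would also report divergently transcribed neighbors, such as the rodent Aldh1a1 duplicates.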
The intron-less structure of the coding region of Aldh1b1 genes suggests that the new copy arose by retrotransposition during vertebrate evolution.

Aldh1a

All known vertebrate Aldh1a genes share the same exon–intron organization, with 12 introns. Although Aldh1a and Aldh2 genes both have 12 introns, the organization of these introns differs significantly (Fig. 2). Three main cytosolic retinaldehyde dehydrogenase enzymes, named Aldh1a1, Aldh1a2, and Aldh1a3, are typically found in tetrapods (Sophos and Vasiliou 2003). Rodents are unique in possessing a fourth enzyme, named Aldh1a4 (Dunn et al. 1989) or Aldh1a7 (Hsu et al. 1999). Our phylogenetic analysis suggested that the extra murine genes arose from a duplication of the Aldh1a1 gene in the rodent clade before the divergence of the mouse and rat lineages (Fig. 1D). This idea is consistent with the genomic position of Aldh1a1 and the extra murine genes, which map as neighbors and are divergently transcribed in regions of the mouse and rat genomes that share conserved syntenies (Table A1). Of the three Aldh1a members, Aldh1a2 was the only one that had previously been described outside tetrapods (Begemann et al. 2001; Grandel et al. 2002). Our searches in nonmammalian vertebrate databases showed that amphibians and birds also have Aldh1a1 and Aldh1a3 enzymes (Fig. 1 and Table A1) (Godbout 1992; Sockanathan and Jessell 1998; Tsukui et al. 1999; Grun et al. 2000; Suzuki et al. 2000). Zebrafish Aldh1a2 was the only Aldh1a member described so far in fishes (Begemann et al. 2001; Grandel et al. 2002). Screening of fugu and zebrafish databases allowed us to identify not only aldh1a2 genes but also aldh1a3 orthologs (Fig. 1D and Table A1). However, aldh1a1 genes were not found in either of the two fish genomes analyzed. Thus, while tetrapods have three Aldh1a members, teleosts seem to have only two, Aldh1a2 and Aldh1a3.

Cephalochordates

Aldh2

No Aldh enzymes had been previously identified in cephalochordates.
From our survey of the B. floridae genome project and the EST database, we assembled six genomic contigs containing seven genes belonging to the Aldh2 or Aldh1a families (Table A1). Only one of these amphioxus genes showed an intron–exon organization with the Aldh2 signature (presence of intron 4 and absence of intron 12b; Fig. 2). The protein predicted from this gene was most similar to vertebrate Aldh2 enzymes (72% against the human ALDH2 vs. 64 1% against ALDH1A proteins) (Table A4). Moreover, this enzyme was the only amphioxus Aldh that grouped within the Aldh2 cluster in the phylogenetic tree (Fig. 1D),

and rendered a robust prediction for mitochondrial localization (Table 1). Therefore, we could confidently assign this amphioxus enzyme to the Aldh2 family.

Aldh1a

The other six amphioxus genes coded for Aldh proteins that grouped into a single cluster in the phylogenetic tree (Fig. 1, B and D), suggesting that they might have originated by multiple gene duplications within the cephalochordate lineage. All six amphioxus Aldh proteins were more similar to each other than to any human ALDH2 or ALDH1A enzymes (Table A4). The Aldh1a nature of these proteins was supported by the presence of the Aldh1a signature in their gene structure (absence of intron 4 and presence of intron 12b; Fig. 2) and by the predicted cytosolic localization of the deduced proteins (Table 1). Thus, the six amphioxus Aldh1a genes probably originated from a burst of independent cephalochordate-specific gene duplications from an ancestral Aldh1a1/2/3 gene (i.e., the pro-ortholog of the current vertebrate Aldh1a1, Aldh1a2, and Aldh1a3). We have named the amphioxus genes Aldh1a1/2/3 (a to f) to reflect this evolutionary origin. The fact that the amphioxus Aldh1a1/2/3c and Aldh1a1/2/3d genes were in the same genomic contig, and the presence of the amphioxus-specific intron 10b in the Aldh1a1/2/3b-c-d clade (Fig. 2), further supported the origin of these genes from gene duplications within the cephalochordate lineage.

Urochordates

A single C. intestinalis enzyme, called Raldh2 (Nagatomo and Fujiwara 2003), was the only retinaldehyde dehydrogenase that had been previously described in urochordates. As in cephalochordates, our in silico screening of EST and genomic databases of ascidians and larvaceans revealed a remarkable complexity of Aldh genes among urochordates (Table A1).

Ascidian Class

Aldh2

Only one Aldh gene in each ascidian species, C. intestinalis and C. savignyi, showed the Aldh2 exon–intron signature: presence of intron 4 and absence of 12b (Fig. 2).
The proteins predicted to be encoded by these ascidian genes showed the highest similarity against vertebrate Aldh2 enzymes ( % against human ALDH2 vs % against human ALDH1A proteins; Table A4), and grouped at the base of the Aldh2 cluster in the evolutionary tree (Fig. 1D). Subcellular localization programs did not predict mitochondrial localization for these ascidian enzymes (Table 1), an unexpected result considering the high similarity with other Aldh2 proteins. Close inspection of the ascidian Aldh2 sequences revealed shortened and divergent N-terminal regions, which contain the mitochondrial localization signal of these enzymes in other species (Nakai and Kanehisa 1992). We therefore assumed that these genes represent the ascidian orthologs of Aldh2, although their subcellular localization and biochemical activity will need further investigation.

Aldh1a

In addition to Aldh2, the C. intestinalis and C. savignyi genomes contained four and three putative Aldh1a genes, respectively. The Aldh1a nature of these genes was corroborated by the presence of the Aldh1a signature in their exon–intron organization: absence of intron 4 and presence of intron 12b (despite the fact that intron 12b appeared to have been secondarily lost once within the urochordate clade). The fact that many ascidian Aldh1a genes appeared to have lost intron 7 and that the predicted proteins clustered together in the evolutionary trees (Figs. 1, B and D, 3) pointed to a lineage-specific origin by gene duplications in the ascidian lineage. Following the same rationale as with amphioxus, we concluded that these genes evolved from an ancestral Aldh1a1/2/3 form and, hence, we have named them Aldh1a1/2/3 (a, b, c, and d for C. intestinalis, and a, bc1, and bc2 for C. savignyi; Table A1).
The long branches of the ascidian Aldh1a1/2/3 cluster probably cause a long-branch-attraction (LBA) artifact and force its phylogenetic position close to Aldh1l in the neighbor-joining tree (Fig. 1D). The use of phylogenetic methods less sensitive to LBA (i.e., maximum likelihood and maximum parsimony) and the elimination of the Aldh1l sequences overcame the LBA distortion of the urochordate Aldh1a clade. The internal topology of the ascidian cluster in the evolutionary tree (Figs. 1D and 3) and exon–intron comparisons (Fig. 2) indicated that C. savignyi and C. intestinalis Aldh1a1/2/3a (which corresponds to the reported Raldh2 gene; Nagatomo and Fujiwara 2003) were orthologs. The other Aldh1a1/2/3 genes (b, c, and d in C. intestinalis, and bc1 and bc2 in C. savignyi) lay together in each species, suggesting again that they originated by independent tandem duplications during the evolution of each Ciona species.

Larvacean Class

Aldh2

Our analysis of the O. dioica genome revealed a single larvacean gene related to the vertebrate Aldh2 or Aldh1a families (Table A1). The complete reorganization of the exon–intron structure of this Oikopleura Aldh gene (Fig. 2), a feature of many Oikopleura genes (Edvardsen et al. 2004; Cañestro et al. 2005), did not permit the use of exon organization as a character for ascribing the gene to an Aldh family. The Aldh2 nature of the larvacean protein, however, was deduced from the sequence identity (66.7% against human ALDH2 vs % against human ALDH1As) (Table A4), its strong prediction score for mitochondrial localization (Table 1), and its position in the evolutionary tree within the cluster of all other nonvertebrate Aldh2 enzymes (Fig. 1D).

Fig. 3. Phylogenetic analysis of the Aldh1a family by maximum-likelihood (A) and maximum-parsimony (B) methods. Numbers are the bootstrap values (%) supporting each node (n ). Poorly supported nodes (<50%) were collapsed.

Aldh1a

After an exhaustive survey of the larvacean genome database (9-fold coverage, Table A1), no Aldh1a1/2/3 ortholog was recognized in Oikopleura, despite the fact that other genes from more distantly related Aldh families (e.g., Aldh5, Aldh6, Aldh8, Aldh9, and Aldh16; data not shown) were clearly identified. The presence of Aldh1a1/2/3 genes in the other urochordate species suggested either a specific gene loss in the larvacean lineage or the absence of the gene from the Oikopleura sequencing project.

Hemichordates

No Aldh has been previously described in hemichordates. BLAST searches against 101,376 EST trace files sequenced by WIBR/MIT and available at NCBI allowed us to assemble six different cDNA contigs that coded for Aldh1a- or Aldh2-related proteins (Table A1). Of the six cDNAs assembled, only two contained the entire coding sequence, and the other four covered only about 50–70% of the coding sequence. There is currently no genomic sequence database available for S. kowalevskii, and therefore the presence of additional Aldh1a or Aldh2 proteins cannot be ruled out.

Aldh2

One of the six predicted proteins showed the highest similarity to human ALDH2 proteins (e.g., 76% against human ALDH2 vs % against human ALDH1As; Table A4). Consistent with the Aldh2 nature of this protein, it was the only S. kowalevskii Aldh that grouped within the clade of other nonvertebrate Aldh2 proteins (Fig. 1D), and its score for mitochondrial localization was higher than for cytoplasmic localization (Table 1). Therefore, we concluded that this S. kowalevskii protein is an Aldh2 ortholog.

Aldh1a

We deduced five other Aldh sequences from the S. kowalevskii EST database (Table A1). The five S. kowalevskii Aldh proteins grouped as a cluster in the phylogenetic tree (Figs. 1, B and D, 3) close to the amphioxus Aldh1a1/2/3 clade, suggesting that they might have originated by multiple gene duplications of an ancestral Aldh1a1/2/3 gene within the hemichordate lineage. Consistent with this lineage-specific origin, all five S. kowalevskii Aldh proteins were more similar to each other than to human ALDH2 or ALDH1A enzymes (Table A4).

Echinoderms

Aldh2

Our screening of the sea urchin S. purpuratus genome revealed two Aldh genes closely related to the Aldh2 family (Table A1). Proteins predicted from the two sea urchin sequences showed highest similarity to other Aldh2 enzymes (70.5% and

67.5% against human ALDH2 vs % and % against human ALDH1As, respectively; Table A4). The two genes shared the same 14-exon organization, and the presence of intron 4 and absence of intron 12b was indicative of their Aldh2 nature (Fig. 2). Accordingly, the sea urchin proteins grouped with other Aldh2 enzymes in evolutionary trees and were predicted to localize to mitochondria (Table 1). We have named the S. purpuratus enzymes Aldh2a and Aldh2b because phylogenetic analysis suggested that both sequences derived from an independent duplication early in echinoderm evolution. Consistently, the recently released sea urchin genome assembly has revealed that both genes are tandemly located in the same genomic region and directly transcribed.

Aldh1a

All attempts to find Aldh1a1/2/3 orthologs in the sea urchin genome were unsuccessful, even though other genes belonging to more distantly related families (e.g., Aldh1l1, Aldh5, and Aldh16) were clearly identified (data not shown). Although we cannot rule out that an Aldh1a1/2/3 gene could be located in a genomic region still not covered by the current genome project, we favor the hypothesis that the S. purpuratus genome may lack an Aldh1a1/2/3 ortholog. Since Aldh1a1/2/3 enzymes appear to be present in hemichordates, the sister group of echinoderms (Cameron et al. 2000), additional genomes need to be explored to distinguish whether the loss of Aldh1a1/2/3 is specific to sea urchins or is common to the entire phylum Echinodermata.

DISCUSSION

To test the hypothesis that the acquisition of RA genetic machinery was a key event for the innovation of developmental mechanisms that produce the chordate body plan, we explored genome databases of chordate and nonchordate deuterostomes. In contrast to the unambiguous identification of Cyp26 and Rar genes, whose orthology could be clearly assigned by reciprocal BLAST searches and phylogenetic analysis (Fig. 1B), the phylogenetic signal of the putative nonvertebrate Aldh1a sequences was not sufficient to conclusively classify them into the Aldh1a or the closely related Aldh2 families by BLAST analysis alone. Different phylogenetic methods (maximum likelihood, maximum parsimony, and neighbor-joining analysis) split the newly identified deuterostome Aldh sequences into two distinct groups, Aldh1a and Aldh2 (Figs. 1D and 3). Although some nodes of the tree did not have high statistical support, the assigned orthology of these two groups to the Aldh1a and Aldh2 families was consistent with the fact that most deuterostome lineages possessed a representative in each Aldh group (see the color code in Fig. 1D). To overcome the lack of resolution of the phylogenetic analysis and to circumvent the inherent limitation of the phylogenetic signal of the Aldh1a–Aldh2 sequences, we took advantage of other types of information whose homoplasy is considered to be low (reviewed in Rokas and Holland 2000), such as gene structures (i.e., intron indels) and a specific sequence motif that, in our case, determines the subcellular localization of the proteins. Thus, the recognition of family-specific signatures in the exon–intron organizations of the Aldh1a and Aldh2 families (Fig. 2) was especially useful to support orthologies inferred from the topologies of the evolutionary trees (Fig. 1), and provides a useful tool for future discrimination between deuterostome Aldh1a and Aldh2 genes. Our screening of echinoderm and hemichordate databases revealed that orthologs of Aldh1a, Cyp26, and Rar genes exist outside the phylum Chordata (Fig. 1B). This discovery means that the gene families for RA metabolism and RA signaling were present before the divergence of extant deuterostomes and, therefore, are not a chordate innovation (Fig. 1C).
The presence of these gene families before the divergence of extant deuterostomes argues against the hypothesis that the acquisition of gene families underlying RA metabolism and signaling was a key event for the evolution of developmental mechanisms that produce the chordate body plan.

In our search for the RA genetic machinery, we found remarkable variability in gene family size among deuterostome lineages, especially for Aldh1a genes (Fig. 1D). There are at least five Aldh1a genes in hemichordates, four in urochordates, and six in cephalochordates (Fig. 1D). Among vertebrates, tetrapods have three main Aldh1a genes, but fish have only two. Independent gene duplication in most deuterostome lineages from a single Aldh1a1/2/3 pro-ortholog present in the stem deuterostome is the most parsimonious hypothesis to explain the origin of the present wide catalog of deuterostome Aldh1a proteins. This hypothesis is supported by our findings: (i) Aldh1a paralogs in the same taxon branch together in the phylogenetic tree; (ii) Aldh1a paralogs in the same taxon share lineage-specific introns; and (iii) many Aldh1a paralogs in the same taxon lie in tandem in the same genomic region. We cannot rule out, however, the possibility that the present catalog of deuterostome Aldh1a genes results from a complex pattern of gene conversion and lineage-specific gene duplications (and losses) from an original extensive set of Aldh1a1/2/3 genes that was already present in the stem deuterostome. For vertebrates, this possibility is unlikely because ALDH1A1 on human chromosome (Hsa) 9 at Hsa9q21.13, and ALDH1A2 and ALDH1A3 at Hsa15q22.1 and Hsa15q26.3, respectively (Table A1), occupy two paralogous chromosome regions that arose in the two rounds of whole genome duplication that occurred at about the time of vertebrate origins (Dehal and Boore 2005). The future assembly of nonvertebrate genomes will help to illuminate the evolutionary origin of the complex catalog of deuterostome Aldh1 genes. Under either hypothesis, however, the main

conclusion holds that one or more Aldh1a1/2/3 pro-orthologs were already present in the stem deuterostome.

Our genomic screening reveals that the Aldh1a family experienced not only extensive amplification but also independent losses in sea urchins and larvacean urochordates (Fig. 1C). In these two taxa, exogenous RA treatments do not cause homeotic changes of anterior to posterior structures (Sciarrino and Matranga 1995; C. Cañestro, unpublished data). In sea urchins, the absence of Aldh1a could be functionally compensated by enzymes from other Aldh families. For example, Aldh8a1 is capable, at least in vitro, of oxidizing retinal, but its contribution to RA signaling is not yet fully understood (Lin and Napoli 2000). Larvaceans, however, also appear to lack Cyp26 and Rar genes, suggesting that the RA machinery might have been lost or modified beyond recognition during larvacean evolution (Fig. 1C); this loss calls into question the contribution of RA signaling to Oikopleura developmental patterning. Overall, our data reveal that the RA genetic machinery has diversified substantially in different deuterostome lineages, raising the possibility that these differences could be related to the morphological diversity of extant deuterostomes by, for example, modifying the spatiotemporal distribution of RA during embryogenesis.

Our work definitively answered the question of whether the genetic machinery for RA signaling was invented by chordates. The answer pushes back in evolutionary time our understanding of the origins of this machinery, but raises another question, for which there are currently insufficient data to draw a firm conclusion: What is the earliest diverging taxon that possesses components of the RA signaling network?
In an attempt to narrow down the origin of RA machinery during animal evolution, we explored available genomic and EST databases of nondeuterostomes (e.g., the ecdysozoan protostomes Drosophila, Anopheles, Aedes, and C. elegans, and the radiates Hydra and Nematostella). Although this analysis did not reveal any convincing Rar, Cyp26, or Aldh1a orthologs, the taxonomic sampling is too narrow and some current genomic databases of nondeuterostomes are too shallow to conclude that the RA genetic machinery was a deuterostome innovation. Analysis of this question must await deeper databases and broader phylogenetic sampling.

In conclusion, our analysis shows for the first time the presence of the three main components of RA metabolism and signaling in nonchordates. We conclude that the gene families for RA metabolism and RA signaling were already present in early deuterostome evolution, thereby calling into question the invention of the RA genetic machinery as a basis for the innovation of developmental mechanisms leading to chordate-specific features.

Our new evolutionary scenario raises a number of new questions. Did the ancestral deuterostome Aldh1a and Cyp26 proteins actually metabolize retinoids? Was RA the ligand for the ancestral deuterostome Rar and, if so, did RA already act as a morphogen regulating the expression of Hox genes? If the ancestral deuterostome did synthesize, bind, and break down RA with its Aldh1a, Rar, and Cyp26 genes, then we must view the deuterostome ancestor as more chordate-like than is generally assumed (Gerhart et al. 2005). On the other hand, if the Aldh1a, Rar, and Cyp26 proteins were present in stem deuterostomes but were not acting in RA metabolism, then discovering which of today's lineages actually interact with RA will help us to understand when various innovations were acquired in the evolution of chordate-specific features. How do our conclusions change if the classical model of deuterostome evolution we assume in Fig.
1C is replaced by a new phylogenetic hypothesis suggesting that urochordates are the closest living relatives of vertebrates and that chordates are no longer monophyletic because cephalochordates are more closely related to Ambulacraria than to vertebrates (Delsuc et al. 2006)? The newly proposed evolutionary scenario does not alter our main conclusion that the RA genetic machinery was already present in the stem deuterostome. If, however, cephalochordates were the sister group of Ambulacraria, it can be inferred that the deuterostome ancestor already had a surprising number of chordate features and that RA genetic machinery was already involved in axial patterning during development. Under either the classical view or the new phylogenetic model, it is possible that the RA genetic machinery was fully functional in the stem deuterostome and that, during the deuterostome radiation, the components of the RA machinery evolved heterogeneously in different lineages by being preserved, recruited, duplicated, or lost from the molecular genetic network controlling the development of lineage-specific morphological features, such as axial patterning, the development of body symmetries, the central nervous system, sensory cells, and endodermal derivatives (Shimeld 1996; Hinman and Degnan 1998; Hinman and Degnan 2000; Manzanares et al. 2000; Ross et al. 2000; Schilling and Knight 2001; Wada 2001; Escriva et al. 2002; Holland 2005a; Vermot and Pourquié 2005; C. Cañestro, unpublished data). The possibility that this heterogeneous evolution of the RA components may have favored morphological radiation among deuterostomes is consistent with the multiple independent gene duplications and losses we found among different deuterostome taxa.
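The argument that the main conclusion holds under either phylogeny can be sketched in a few lines of Python. The sketch assumes a Dollo-like model in which a gene family arises only once, so that a family found on both sides of the basal split of a tree is inferred to have been present at its root. The two topologies are simplified five-taxon renderings of the classical model and of the alternative hypothesis discussed above. The presence sets follow the text where it is explicit (for example, no Aldh1a ortholog in sea urchins) and are otherwise assumptions for illustration, as are all function and variable names.

```python
# Simplified rooted topologies as nested 2-tuples; leaves are taxon names.
CLASSICAL = (
    ("echinoderms", "hemichordates"),                       # Ambulacraria
    ("urochordates", ("cephalochordates", "vertebrates")),  # Chordata
)
ALTERNATIVE = (
    ("cephalochordates", ("echinoderms", "hemichordates")),
    ("urochordates", "vertebrates"),
)

ALL_TAXA = {
    "echinoderms", "hemichordates", "urochordates",
    "cephalochordates", "vertebrates",
}

# Taxa in which each family is taken to be present (partly assumed, see text).
CARRIERS = {
    "Aldh1a": ALL_TAXA - {"echinoderms"},  # no Aldh1a found in sea urchin
    "Cyp26": set(ALL_TAXA),
    "Rar": set(ALL_TAXA),
}


def leaves(tree):
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)


def present_at_root(tree, carriers):
    """Dollo-like inference: a family that arose once was present at the
    root if carriers occur on both sides of the basal split."""
    left, right = tree
    return bool(leaves(left) & carriers) and bool(leaves(right) & carriers)


for family, carriers in CARRIERS.items():
    for label, tree in (("classical", CLASSICAL), ("alternative", ALTERNATIVE)):
        print(family, label, present_at_root(tree, carriers))
```

Under these assumptions all three families are inferred at the root of both trees. A family restricted to chordates would be placed at the root only under the alternative topology, which mirrors the remark above that this hypothesis implies a more chordate-like deuterostome ancestor.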
Finally, the new Aldh1a, Cyp26, and Rar genes identified here are significant because they provide the genomic information necessary to design functional experiments that investigate the developmental roles of RA across a variety of taxa and, in doing so, improve our understanding of deuterostome radiation and the evolutionary origin of chordate developmental mechanisms.

Acknowledgments

For generously making genome sequences publicly available, we thank D. Chourrout and Genoscope for O. dioica; L. Holland, J. Gibson-Brown, and JGI for B. floridae; and J. Aronowicz, C. J.

Lowe, and WIBR/MIT for S. kowalevskii. This material is based on work supported by NSF Grant IBN to J. H. P. and C. C., HD22486 to J. H. P., by Ministerio de Ciencia y Tecnología (Spain), grant BMC to R. G. D. and R. A., and EX to C. C., and by DURSI (Generalitat de Catalunya), grant 2005BE00080 to R. A.

REFERENCES

Altschul, S. F., et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:
Begemann, G., Schilling, T. F., Rauch, G. J., Geisler, R., and Ingham, P. W. The zebrafish neckless mutation reveals a requirement for raldh2 in mesodermal signals that pattern the hindbrain. Development 128:
Cameron, C. B., Garey, J. R., and Swalla, B. J. Evolution of the chordate body plan: new insights from phylogenetic analyses of deuterostome phyla. Proc. Natl. Acad. Sci. USA 97:
Cañestro, C., Bassham, S., and Postlethwait, J. Development of the central nervous system in the larvacean Oikopleura dioica and the evolution of the chordate brain. Dev. Biol. 285:
Dehal, P., and Boore, J. L. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 3: e314.
Delsuc, F., Brinkmann, H., Chourrout, D., and Philippe, H. Tunicates and not cephalochordates are the closest living relatives of vertebrates. Nature 439:
Dunn, T. J., Koleske, A. J., Lindahl, R., and Pitot, H. C. 1989. Phenobarbital-inducible aldehyde dehydrogenase in the rat. cDNA sequence and regulation of the mRNA by phenobarbital in responsive rats. J. Biol. Chem. 264:
Edvardsen, R. B., et al. Hypervariable and highly divergent intron-exon organizations in the chordate Oikopleura dioica. J. Mol. Evol. 59:
Escriva, H., Delaunay, F., and Laudet, V. Ligand binding and nuclear receptor evolution. Bioessays 22:
Escriva, H., Holland, N. D., Gronemeyer, H., Laudet, V., and Holland, L. Z. The retinoic acid signaling pathway regulates anterior/posterior patterning in the nerve cord and pharynx of amphioxus, a chordate lacking neural crest. Development 129:
Fujiwara, S., and Kawamura, K. Acquisition of retinoic acid signaling pathway and innovation of the chordate body plan. Zool. Sci. 20:
Gerhart, J., Lowe, C., and Kirschner, M. Hemichordates and the origin of chordates. Curr. Opin. Genet. Dev. 15:
Godbout, R. High levels of aldehyde dehydrogenase transcripts in the undifferentiated chick retina. Exp. Eye Res. 54:
Grandel, H., et al. Retinoic acid signalling in the zebrafish embryo is necessary during pre-segmentation stages to pattern the anterior-posterior axis of the CNS and to induce a pectoral fin bud. Development 129:
Grun, F., Hirose, Y., Kawauchi, S., Ogura, T., and Umesono, K. Aldehyde dehydrogenase 6, a cytosolic retinaldehyde dehydrogenase prominently expressed in sensory neuroepithelia during development. J. Biol. Chem. 275:
Hinman, V. F., and Degnan, B. M. Retinoic acid disrupts anterior ectodermal and endodermal development in ascidian larvae and postlarvae. Dev. Genes Evol. 208:
Hinman, V. F., and Degnan, B. M. Retinoic acid perturbs Otx gene expression in the ascidian pharynx. Dev. Genes Evol. 210:
Holland, L. Z. 2005a. Non-neural ectoderm is really neural: evolution of developmental patterning mechanisms in the non-neural ectoderm of chordates and the problem of sensory cell homologies. J. Exp. Zool. B Mol. Dev. Evol. 304:
Holland, N. D. 2005b. Chordates. Curr. Biol. 15: R911-R914.
Hsu, L. C., and Chang, W. C. Cloning and characterization of a new functional human aldehyde dehydrogenase gene. J. Biol. Chem. 266:
Hsu, L. C., Chang, W. C., Hoffmann, I., and Duester, G. Molecular analysis of two closely related mouse aldehyde dehydrogenase genes: identification of a role for Aldh1, but not Aldh-pb, in the biosynthesis of retinoic acid. Biochem. J. 339 (Part 2):
Kumar, S., Tamura, K., Jakobsen, I. B., and Nei, M. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:
Lin, M., and Napoli, J. L. cDNA cloning and expression of a human aldehyde dehydrogenase (ALDH) active with 9-cis-retinal and identification of a rat ortholog, ALDH12. J. Biol. Chem. 275:
Manzanares, M., et al. Conservation and elaboration of Hox gene regulation during evolution of the vertebrate head. Nature 408:
Marshall, H., et al. A conserved retinoic acid response element required for early expression of the homeobox gene Hoxb-1. Nature 370:
Nagatomo, K., and Fujiwara, S. Expression of Raldh2, Cyp26 and Hox-1 in normal and retinoic acid-treated Ciona intestinalis embryos. Gene Exp. Patterns 3:
Nakai, K., and Horton, P. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24:
Nakai, K., and Kanehisa, M. A knowledge base for predicting protein localization sites in eukaryotic cells. Genomics 14:
Niederreither, K., et al. Genetic evidence that oxidative derivatives of retinoic acid are not involved in retinoid signaling during mouse development. Nat. Genet. 31:
Perozich, J., Nicholas, H., Wang, B. C., Lindahl, R., and Hempel, J. Relationships within the aldehyde dehydrogenase extended family. Protein Sci. 8:
Reijntjes, S., Blentic, A., Gale, E., and Maden, M. The control of morphogen signalling: regulation of the synthesis and catabolism of retinoic acid in the developing embryo. Dev. Biol. 285:
Rokas, A., and Holland, P. W. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 15:
Ross, S. A., McCaffery, P. J., Drager, U. C., and De Luca, L. M. Retinoids in embryonal development. Physiol. Rev. 80:
Rzhetsky, A., Ayala, F. J., Hsu, L. C., Chang, C., and Yoshida, A. Exon/intron structure of aldehyde dehydrogenase genes supports the introns-late theory. Proc. Natl. Acad. Sci. USA 94:
Schilling, T. F., and Knight, R. D. Origins of anteroposterior patterning and Hox gene regulation during chordate evolution. Philos. Trans. R. Soc. Lond. B Biol. Sci. 356:
Schmidt, H. A., Strimmer, K., Vingron, M., and von Haeseler, A. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:
Sciarrino, S., and Matranga, V. Effects of retinoic acid and dimethylsulfoxide on the morphogenesis of the sea urchin embryo. Cell Biol. Int. Rep. 19:
Shimeld, S. M. Retinoic acid, hox genes and the anterior-posterior axis in chordates. BioEssays 18:
Sockanathan, S., and Jessell, T. M. Motor neuron-derived retinoid signaling specifies the subtype identity of spinal motor neurons. Cell 94:
Sophos, N. A., and Vasiliou, V. Aldehyde dehydrogenase gene superfamily: the 2002 update. Chem. Biol. Interact :
Suzuki, R., et al. Identification of RALDH-3, a novel retinaldehyde dehydrogenase, expressed in the ventral region of the retina. Mech. Dev. 98:
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:
Tsukui, T., et al. Multiple left-right asymmetry defects in Shh(-/-) mutant mice unveil a convergence of the shh and retinoic acid pathways in the control of Lefty-1. Proc. Natl. Acad. Sci. USA 96:

Vasiliou, V., Bairoch, A., Tipton, K. F., and Nebert, D. W. Eukaryotic aldehyde dehydrogenase (ALDH) genes: human polymorphisms, and recommended nomenclature based on divergent evolution and chromosomal mapping. Pharmacogenetics 9:
Vermot, J., and Pourquié, O. Retinoic acid coordinates somitogenesis and left-right patterning in vertebrate embryos. Nature 435:
Wada, H. Origin and evolution of the neural crest: a hypothetical reconstruction of its evolutionary history. Dev. Growth Differ. 43:
Wall, D. P., Fraser, H. B., and Hirsh, A. E. 2003. Detecting putative orthologs. Bioinformatics 19:
Yoshida, A., Rzhetsky, A., Hsu, L. C., and Chang, C. Human aldehyde dehydrogenase gene family. Eur. J. Biochem. 251:

APPENDIX

Table A1. Accession numbers and information related to the Aldh sequences used in this study

Rows are grouped by species genome, size (coverage). Columns: Name | Accession number (note 1) | Genomic information

Saccoglossus kowalevskii, Mb ( )
Aldh2 | This work | -
Aldh1a1/2/3a | This work | -
Aldh1a1/2/3b | This work | -
Aldh1a1/2/3c | This work | -
Aldh1a1/2/3d | This work | -
Aldh1a1/2/3e | This work | -

Strongylocentrotus purpuratus, Mb (6x)
Aldh2a | This work and XP_ | Trace file database and NW_
Aldh2b | This work and XP_ | Trace file database and NW_
Aldh1l | This work and XP_ | Trace file database

Oikopleura dioica (note 2), 72 Mb (9x)
Aldh2 | This work | Trace file database

Ciona intestinalis (notes 3, 4), 160 Mb (8.2x)
Aldh2 | ci | sc 184
Aldh1a1/2/3a | ci | sc 18
Aldh1a1/2/3b | ci | sc 112
Aldh1a1/2/3c | ci | sc 112
Aldh1a1/2/3d | ci | sc 112
Aldh1l | ci | sc 1

Ciona savignyi, Mb (13x)
Aldh2 | SINCSIG | sc 360
Aldh1a1/2/3a | SINCSIG | sc 329
Aldh1a1/2/3bc1 | SINCSIG | sc 76
Aldh1a1/2/3bc2 | SINCSIG | sc 76

Branchiostoma floridae, Mb (13x)
Aldh2 | This work | Trace file database
Aldh1a1/2/3a | This work | Trace file database
Aldh1a1/2/3b | This work | Trace file database
Aldh1a1/2/3c | This work | Trace file database
Aldh1a1/2/3d | This work | Trace file database
Aldh1a1/2/3e | This work | Trace file database
Aldh1a1/2/3f | This work | Trace file database
Aldh1l | This work | Trace file database

Danio rerio, Mb (6.5-7x)
Aldh2a | NP_ | LG5, Zv5_scaffold1383
Aldh2b | NP_ | LG5, Zv5_scaffold1383
Aldh1a2 | NP_ | LG7
Aldh1a3 | This work (DQ300198) | Zv5_sc1492, and NA2068
Aldh1l1 | XP_ | NW_
Aldh1l2 | XP_ | NW_

Takifugu rubripes, Mb (5.7x)
Aldh2 | SINFRUP | sc 3571
Aldh1a2 | BAE20172 | sc 233
Aldh1a3 | This work | sc 1420 and sc 4033
Aldh1l | SINFRUP | sc 1384
Aldh1l2 | SINFRUP | sc 1786

Xenopus tropicalis, Mb (7.65x)
Aldh2 | fgenesh1_pg.c_scaffold_ | sc 501
Aldh1b1 | fgenesh1_pm.c_scaffold_ | sc 153
Aldh1a1 | fgenesh1_pg.c_scaffold_ | sc 982
Aldh1a2 | fgenesh1_pg.c_scaffold_ | sc 297
Aldh1a3 | fgenesh1_pg.c_scaffold_ | sc 208
Aldh1l1 | fgenesh1_kg.c_scaffold_ | sc 368
Aldh1l2 | fgenesh1_pg.c_scaffold_ | sc 434


More information

v Scientists have identified 1.3 million living species of animals v The definition of an animal

v Scientists have identified 1.3 million living species of animals v The definition of an animal Biosc 41 9/10 Announcements BIOSC 041 v Genetics review: group problem sets Groups of 3-4 Correct answer presented to class = 2 pts extra credit Incorrect attempt = 1 pt extra credit v Lecture: Animal

More information

BIOINFORMATICS: An Introduction

BIOINFORMATICS: An Introduction BIOINFORMATICS: An Introduction What is Bioinformatics? The term was first coined in 1988 by Dr. Hwa Lim The original definition was : a collective term for data compilation, organisation, analysis and

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

More information
