Bioinformatics Report
Branchiostoma lanceolatum dopamine D1/β receptor protein phylogenetic analysis
Alanna Lewis
Abstract: Dopamine is an essential neurotransmitter for many species of chordates. The transmembrane receptors that bind dopamine and trigger important signalling cascades are known as dopamine receptors, the most common of which are D1 and D2. We analysed the conservation of protein sequence across various taxa for comparison with Branchiostoma lanceolatum, which possesses the least complex nervous system of the species examined. Multiple sequence alignment and phylogenetic analysis suggest that, as complex terrestrial locomotion arose and organisms diversified onto land, sequence domains were added to the original structure of the D1 receptor observed in B. lanceolatum. This occurred along with the development of the nervous system necessary to support such lifestyles. High conservation of protein structure was observed even between general protein models and the human β2-adrenergic G protein-coupled receptor.

Introduction: Dopamine is a biogenic catecholamine which serves as an important neurotransmitter in many organisms (Burman et al., 2009). It is crucial for the control of locomotion, cognitive processes, food intake and endocrine regulation, and plays a role in many other physiological functions (Missale et al., 1998). As dopamine is a critical compound in both vertebrate and invertebrate taxa, dopamine receptors are believed to be present across a wide range of organisms. Although five subtypes of dopamine receptor have been identified, D1 and D2 have been found to be the most prevalent to date (Burman and Evans, 2010; Missale et al., 1998). Dopamine receptors are transmembrane G protein-coupled receptors, with a high level of conservation within transmembrane domains and across families (Burman et al., 2009). The
structure of class D1 (Figure 1) and class D2 dopamine receptors varies only slightly, with resultant variation in ligand affinity and effector coupling between classes (Burman et al., 2009). The various classes of dopamine receptors are distributed throughout the body in areas including the brain, pituitary, blood vessels and kidney, and are involved in numerous signalling cascades (Missale et al., 1998). One of the main signal transduction pathways involving the D1 family of receptors is the activation of adenylyl cyclase. The receptor couples to the Gαs G protein to activate adenylyl cyclase, resulting in an increase in cAMP (Burman et al., 2009). D1 receptors are also involved in the regulation of intracellular calcium concentrations, directly through the stimulation of phosphatidylinositol hydrolysis by phospholipase C, although this has not been demonstrated conclusively in mammalian taxa (Missale et al., 1998).

Studies by Burman et al. (2009) and Burman and Evans (2010) have experimentally demonstrated the conservation of function of dopamine D1-like β receptors from lancelets, the most basal group of chordates, to those expressed in various mammals. Based on this research and the high level of conservation within transmembrane domains seen in all five classes of dopamine receptors (Missale et al., 1998), we can expect that phylogenetic analysis of species exhibiting homologous protein sequences will demonstrate close relationships even across diverse taxa. We can also expect a high level of sequence conservation when analysing the composition of homologous proteins across various species. Finally, because of the high expected proportion of sequence conservation, we can expect the homologous proteins identified across various taxa to exhibit a structure similar to that identified by Missale et al. (1998) (Figure 1).
Methods: A basic local alignment search was performed on an unknown gene sequence using the nucleotide BLAST tools available through the National Center for Biotechnology Information (NCBI) website (http://blast.ncbi.nlm.nih.gov/blast.cgi). The sequence with the highest total score and lowest E-value was designated as the identity of the unknown sequence. This information was used to identify the species that possessed the unknown sequence. The respective protein sequence was obtained from the NCBI records.

A protein BLAST was performed through the NCBI website on the obtained amino acid sequence, and protein homologues for the gene product were obtained across a variety of taxa. The amino acid sequences of homologous proteins in 25 additional species (Table 1) exhibiting an E-value less than 1 × 10⁻¹⁰ were obtained from the results of the BLAST. These sequences were compiled and a Clustal W2 sequence analysis was performed using the resources provided by the European Bioinformatics Institute (http://www.ebi.ac.uk). The results of the Clustal W2 analysis were used to further analyse the homology of the 25 additional sequences and to refine the phylogenetic tree. The Jalview program, also offered by the European Bioinformatics Institute, was used to identify regions of the homologous sequences that were conserved between the various species.

The program MEGA, offered by the College of Biological Sciences at the University of Guelph, was used to create a multiple alignment using Clustal W parameters. A phylogenetic tree was also produced in MEGA using the neighbour-joining method with bootstrap analysis of 1000 replicates.

The primary, secondary and tertiary structures of the original protein were obtained using the ModBase database of comparative protein structure models offered by the University of California (http://modbase.compbio.ucsf.edu/modbase-cgi/index.cgi). Protein
structures were further predicted using the SCRATCH protein predictor tool, available online through the ExPASy Proteomics Server (http://www.expasy.org/tools/#primary).

Results: The protein encoded by the unknown gene sequence was determined to be the dopamine D1/β receptor of Branchiostoma lanceolatum. The DNA sequence (Figure 2a) and the translation of the resultant amino acid product (Figure 2b) were determined. The 391 amino acids of the product account for fewer codons than the 1528 bp gene sequence contains, indicating that the protein is encoded by an open reading frame within the sequence. The SCRATCH protein predictor projected a secondary structure of:

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHCHEEEEEEEECCCCCCHHHHH
HHHHHHHHHHHHHECCCHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEEEECCCCCCCCCCHHH
HHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCEECCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
HHHHHHHHHHCCCCCCCCCCCCCHHCCHHHHHHHHHHHHHHHHEEEEECHHHHHHHHHHCCCCCCCHHHHHHHHHHH
HHHHCCCHHHHECCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCC

The sequence was found to possess five disulphide bonds, with cysteines at positions 28, 66, 195, 197, 278, 292, 313, 334, 335, and 338. The tertiary structure of the protein was determined and is composed primarily of α-helices (Figure 3). Because the sequence for Branchiostoma lanceolatum is based on unpublished research, only a predicted tertiary structure can be obtained for this species to date. The predicted structure is modelled on the crystal structure of the human β2-adrenergic G protein-coupled receptor (Figure 4).

Multiple alignment of the various protein sequences demonstrated a high proportion of conservation throughout the length of the sequences across taxa (Figure 5). Regions which demonstrated little or no sequence conservation between species were observed
in amino acid regions 1-28, 44-59, 197-213, 325-337, and 396-494. Interspecies sequence conservation tended to be lower in the bony fishes and the common lancelet (B. lanceolatum) than among the rest of the species examined (Figure 6). The goldfish (Carassius auratus), grass carp (Ctenopharyngodon idella) and European sea bass (Dicentrarchus labrax) had the highest level of homology with the common lancelet (B. lanceolatum).

Phylogenetic trees predicting the possible evolutionary history of the dopamine D1/β receptor were produced both in MEGA (Figure 6) and from the Clustal W2 sequence alignment (Figure 7). Substantial discrepancies between the two trees were observed. Although no bootstrap analysis was performed for the tree generated through Clustal W2, this tree is more consistent with the expected evolutionary history. It is believed to be the more accurate representation of the phylogeny of this protein because its grouping of species is closer to the currently accepted taxonomy.

Discussion: Branchiostoma lanceolatum, the organism from which this particular dopamine D1/β receptor sequence was obtained, is considered to be one of the most basal chordates (Burman et al., 2009). This analysis confirmed the highly conserved nature of the dopamine D1/β receptor across a broad range of taxa. All species found to possess homologues of the initial protein sequence were members of the phylum Chordata, which is consistent with the function of the protein as previously discussed. However, the B. lanceolatum sequence was among the least homologous when compared with the 25 other species in the multiple sequence alignment. This finding possibly mirrors results on the evolution of synapsin proteins, which are involved in neuronal functions, where the protein structure acquired additional
domains throughout its evolution in order to allow for increasingly complex nervous systems (Candiani et al., 2010). In the case of dopamine receptors, new domains were probably incorporated into the protein structure to provide greater functionality in organisms that were developing increasingly complex cognitive functioning and locomotion as they diversified onto land. This shift from an aquatic to a terrestrial lifestyle would have depended on an increasingly complex nervous system and may account for the development of the five classes of dopamine receptors. The incorporation of new domains is supported by the trends seen in the phylogenetic analysis, in which teleosts and other aquatic taxa were tightly grouped, while organisms that had diversified to a terrestrial habitat were separate yet also closely associated within their own subdivisions of taxa (Figure 7). The aquatic and semi-aquatic species showed close homology with one another but low homology with the entire range of taxa analysed. This can also be accounted for by the insertion of new domains into the sequence throughout its evolutionary history, even though conservation of the original sequence remained high.

Bootstrap analysis of the phylogenetic tree (Figure 6) gave high values for receptor sequence homology between members of closely related taxa. This was evident within the fishes and, despite their separate branching patterns and locations on the tree, high bootstrap values were also seen within the primates (Figure 6). This is expected, as their life histories indicate similar means of locomotion and closely related brain function and morphology. Lower bootstrap values are seen predominantly in the primary branches, in contrast with the high values in the branches between individual species. This is possibly due to the low number of species used from each taxon.
To improve the branching patterns, more species within each taxon, as well as species from a more diverse selection of taxa, would need to be
included in the analysis. Many species were omitted because predicted protein sequences were excluded from this preliminary assessment. We recommend that further examination of these relationships include predicted sequences, both to improve understanding of the modifications made to dopamine D1/β receptors and to allow stronger links to be drawn between sequence adaptation and the evolution of neurological function.

When comparing the tertiary structure of the protein to the general dopamine D1 class of receptors identified by Missale et al. (1998) (Figure 1), only a predicted tertiary structure could be identified (Figure 3). A closely related protein in humans (the human β2-adrenergic G protein-coupled receptor) has been crystallized and was also used for comparison with the general model. Both of the tertiary protein structures obtained paralleled the model in the number and orientation of α-helices which would be embedded in the cell membrane. Disulphide bridges were also observed in similar locations in all three structures, as were ligand binding sites. Further comparison showed the location of extracellular loops to vary only slightly, with the second extracellular loop held out of activity in the human β2-adrenergic G protein-coupled receptor by disulphide bonds also seen in the general structural model (Cherezov et al., 2007; Missale et al., 1998). Comparing the general model and the predicted B. lanceolatum structure with that of the human β2-adrenergic G protein-coupled receptor makes it clear that the human protein contains a structural addition that is not present in the other two models. This is possibly the result of the aforementioned addition of various domains throughout the evolution of the protein in order to adapt to increasingly complex nervous systems and patterns of locomotion.
To increase understanding of this structure's evolution, protein structural analysis should be performed across the examined taxa and contrasted with the sequence alignment (Figure 5), to compare sites of domain insertion with structural anomalies.
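As a starting point for locating candidate insertion sites in an alignment, a per-column conservation score can be computed and runs of poorly conserved columns reported. The following Python sketch is illustrative only: the function names, the scoring rule (fraction of sequences sharing the most common residue, with gaps counted as mismatches), the thresholds and the toy alignment are all assumptions, and it is not the method used by Jalview or Clustal W2.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') are treated as mismatches, so insertion-rich columns score low.
    """
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n_seqs)
    return scores


def low_conservation_regions(scores, threshold=0.5, min_length=3):
    """Return 1-based (start, end) runs of columns scoring below threshold."""
    regions, start = [], None
    for i, score in enumerate(scores, start=1):
        if score < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the final column.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Toy alignment of hypothetical sequences (not the report's data).
toy = [
    "MSANTLVILAV--TC",
    "MTPQSLVILAVGGTC",
    "MKGHELVILAV--TC",
    "M-DWRLVILAVAPTC",
]
print(low_conservation_regions(column_conservation(toy), threshold=0.6, min_length=2))
# prints [(2, 5), (12, 13)]
```

Applied to a real alignment, the reported runs could then be compared with loop and helix boundaries from the structural models.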
Figures:

Figure 1: Generalized structure of class D1 dopamine/β receptor. This transmembrane protein has a channel structure and is composed of 7 α-helices.
ATGATTACGCCAAGCTATTTAGGTGACACTATAGAATACTCAAGCTATGCATCAAGCTTGGTACCGAGCT
CGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGGCTTGCCACCATGTCGGCGAACACTACGGTC
TCTCCCACCGAGACAACCGCTAACCTCACCGCCAATTCCACCGAGGCGTCCGTCGGGTCCTGTTTCGCCC
CCAACCCGTACAGTGCCGGGGTCCAGGCCGTTCTAGGTCTGATCACGGTGATTCTCATCCTTCTGACTGT
GATAGGTAACGTGTTAGTGATCCTAGCCGTCACCTGCCACCGGAAAATGCGAACTGTGACCAACTTCTTC
ATAGTATCGCTGGCTTGTGCAGACCTCAGCGTCGGGATCACCGTCCTGCCCTTCGCTGCCACCAACGACA
TCCTCGGCTACTGGCCTTTCGGCGGCTACTGTGACGTCTGGGTGTCGTTCGACGTCCTGAACTCCACGGC
GAGTATCCTGAACCTGGTTGTGATCGCATTCGACCGATTCCTCGCCATCACAGCCCCCTTTACCTACCAC
ACTCGCATGACGGAGCGAACTGCCGGTATTCTGATCGCGACGGTGTGGGGGATCTCGCTGGTCGTGTCCT
TTCTACCCATCCAGGCGGGCTGGTACAGGGACAACCAGTCCGAAGAGGCCTTGGCGATCTATTCGGACCC
GTGTTTATGCATCTTCACTGCGAGCACTGCTTACACCATCGTGTCGTCCCTCATATCGTTCTACATACCG
CTCCTCATCATGCTTGTGTTCTACGGGATCATCTTCAAGGCAGCCCGAGACCAAGCTCGCAAGATCAACG
CTCTGGAAGGGCGTTTAGAGCAGGAAAACAACCGGGGCAAGAAAATATCTCTGGCAAAGGAGAAAAAGGC
GGCAAAGACACTAGGCATCATCATGGGAGTGTTTATCCTGTGTTGGTTGCCGTTTTTCGTGGTGAACATT
GTGAACCCGTTCTGTGACAGGTGTGTGCAGCCAGCCGTGTTCATCGCGCTCACATGGCTCGGATGGATCA
ACTCCTGCTTCAATCCGATCATCTACGCCTTCAACAAGGAGTTCAGGAAGGTCTTCGTAAAGATGATCTG
TTGTCACAAGTGCAGAGGTGTGACAGTGGGGCCTAACCACGCAGACTTGAACTACGACCCCGTAGCGATG
CGGCTCAAGAAGAGGGGAGAAAACGCCAATGGGACCGTCAACGGCGACGCCAACGGCAAGGCCAACGGCA
ACATAGAGGCCGGTGAAGGAACGTCAAGTTCATAAGGTAACGTGTTTAGTTTAGGCGAAAGGGAGCTGGA
CTCTCACGAAGACTGAAACTAAAGTTTCTGAGAGTTGATAGAAGAACGGAGGGACAAATTGAACACCACT
GCCAATATGAAGTGGATATGTAAGATAGTGCCAGCACAACATCATTATACGAAGCCGAATTCTGCAGATA
TCCATCACACTGGCGGCCGCTCGAGCATGCATCTAGAGGCCCAATTCGCCCTATAGTG

Figure 2a: The gene sequence which encodes the dopamine D1/β receptor protein in Branchiostoma lanceolatum. The sequence is 1528 bp in length.
MSANTTVSPTETTANLTANSTEASVGSCFAPNPYSAGVQAVLGLITVILILLTVIGNVLVILAVTCHRKM
RTVTNFFIVSLACADLSVGITVLPFAATNDILGYWPFGGYCDVWVSFDVLNSTASILNLVVIAFDRFLAI
TAPFTYHTRMTERTAGILIATVWGISLVVSFLPIQAGWYRDNQSEEALAIYSDPCLCIFTASTAYTIVSS
LISFYIPLLIMLVFYGIIFKAARDQARKINALEGRLEQENNRGKKISLAKEKKAAKTLGIIMGVFILCWL
PFFVVNIVNPFCDRCVQPAVFIALTWLGWINSCFNPIIYAFNKEFRKVFVKMICCHKCRGVTVGPNHADL
NYDPVAMRLKKRGENANGTVNGDANGKANGNIEAGEGTSSS

Figure 2b: The protein sequence for the dopamine D1/β receptor encoded by the above gene sequence (2a). The protein is 391 AA in length. Its existence is inferred from homology.
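The relationship between the two sequences in Figure 2 can be checked with a simple open reading frame search. The sketch below is an assumption-laden illustration rather than the tool used for this report: `longest_orf` is a hypothetical helper that scans only the forward strand, uses the standard genetic code, and ignores reading frames that run off the end of the sequence without a stop codon. The toy input reuses the first seven codons of the reading frame visible in Figure 2a (ATG TCG GCG AAC ACT ACG GTC) followed by an artificial TAA stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def longest_orf(dna):
    """Return (start_index, protein) for the longest ATG-to-stop ORF.

    Only the forward strand is scanned; start_index is 0-based.
    Codons containing unexpected characters are translated as 'X'.
    """
    dna = dna.upper()
    best = (-1, "")
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        protein = []
        for i in range(start, len(dna) - 2, 3):
            aa = CODON_TABLE.get(dna[i:i + 3], "X")
            if aa == "*":  # stop codon closes the reading frame
                if len(protein) > len(best[1]):
                    best = (start, "".join(protein))
                break
            protein.append(aa)
    return best


# Toy fragment: two flanking bases, seven codons from Figure 2a, an added stop.
toy = "CC" + "ATGTCGGCGAACACTACGGTC" + "TAA" + "GG"
print(longest_orf(toy))  # prints (2, 'MSANTTV')

# Length bookkeeping for Figure 2: 391 residues plus one stop codon.
orf_nt = (391 + 1) * 3   # 1176 nucleotides
flanking_nt = 1528 - orf_nt  # 352 nucleotides outside the reading frame
```

The arithmetic at the end shows why the amino acid count and the sequence length differ: the coding region accounts for 1176 of the 1528 bp, leaving 352 bp of flanking sequence.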
Figure 3: Predicted tertiary structure of Branchiostoma lanceolatum dopamine D1/β receptor. The structure is composed of approximately 7 α-helices. It is a transmembrane protein and a member of the family of G protein-coupled receptors.

Figure 4: Crystal structure of the human β2-adrenergic G protein-coupled receptor. Despite the large phylogenetic distance between B. lanceolatum and Homo sapiens, the original structure is highly conserved. Modifications to the protein structure occur most evidently as additions to the primary α-helix channel.
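Helical segments such as those counted in Figure 3 can also be enumerated from a three-state secondary structure string of the kind reported in the Results (H for helix, E for strand, C for coil). The Python sketch below is illustrative: `structure_segments` and the `min_length` cut-off are assumptions, and the toy string is invented rather than taken from the SCRATCH prediction.

```python
from itertools import groupby


def structure_segments(ss, state="H", min_length=1):
    """Return 1-based (start, end) spans of consecutive `state` characters."""
    segments, pos = [], 1
    for char, run in groupby(ss):
        length = len(list(run))
        if char == state and length >= min_length:
            segments.append((pos, pos + length - 1))
        pos += length
    return segments


# Invented toy string: coil, an 8-residue helix, a strand, a 2-residue
# helix fragment, and a 6-residue helix.
toy_ss = "CCC" + "H" * 8 + "CC" + "E" * 4 + "CC" + "HH" + "CCC" + "H" * 6 + "C"

print(structure_segments(toy_ss))                # every helical run
print(structure_segments(toy_ss, min_length=5))  # longer helices only
```

Because a single non-helical residue splits a run, the number of segments found this way depends on the chosen `min_length` and need not equal the number of helices visible in a tertiary model.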
Figure 5a: Results of a multiple alignment with a total of 26 species across taxa. This section depicts AA 1-111. Coloured regions are conserved between species. Teleosts and B. lanceolatum are depicted in the lower half of the alignment.

Figure 5b: A continuation of the multiple alignment. This section depicts AA 112-246.
Figure 5c: A continuation of the multiple alignment. This section depicts AA 247-381.

Figure 5d: A continuation of the multiple alignment. This section depicts AA 382-494. The separation between aquatic and terrestrial mammal homology is evident in the 409-450 AA region.
Figure 6: Phylogenetic tree depicting possible evolutionary history of dopamine D1/β receptor. For scientific names refer to Table 1. Tree obtained in MEGA using the neighbour-joining method with a Poisson substitution model and 1000 replicates to calculate bootstrap values. The bootstrap values show strong homology within the aquatic organisms. The bottom branch shows a low bootstrap value and does not coincide with the expected evolutionary history.
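Two ingredients of the analysis behind Figure 6 can be sketched briefly: the Poisson-corrected distance between two aligned protein sequences, d = -ln(1 - p), where p is the proportion of differing sites, and the resampling of alignment columns with replacement that produces one bootstrap replicate. The Python below is a minimal illustration under stated assumptions: the function names are hypothetical, columns containing a gap in either sequence are skipped, the example sequences are invented, and the neighbour-joining step itself is not implemented. It is not MEGA's implementation.

```python
import math
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing residues over columns where neither has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p); infinite when p reaches 1."""
    p = p_distance(seq_a, seq_b)
    return math.inf if p >= 1.0 else -math.log(1.0 - p)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


# Invented two-sequence alignment for illustration.
toy = ["MSANTLVI", "MSATTLVV"]
rng = random.Random(0)  # fixed seed so the replicates are reproducible

replicate_distances = [
    poisson_distance(*bootstrap_alignment(toy, rng)) for _ in range(1000)
]
print(poisson_distance(*toy), sum(replicate_distances) / len(replicate_distances))
```

In a full analysis a tree would be built from the distance matrix of each replicate, and the bootstrap value of a branch would be the percentage of replicate trees in which that grouping recurs.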
Figure 7: Phylogenetic tree depicting possible evolutionary history of dopamine D1/β receptor. Tree obtained from the European Bioinformatics Institute using a Clustal W2 alignment of the sequences. Species names and classifications can be seen in Table 1.
Tree name        Order              Family            Genus species
Cat              Carnivora          Felidae           Felis catus
Dog              Carnivora          Canidae           Canis lupus familiaris
Wild boar        Artiodactyla       Suidae            Sus scrofa
Sheep            Artiodactyla       Bovidae           Ovis aries
Rhesus-monkey    Primates           Cercopithecidae   Macaca mulatta
Human            Primates           Hominidae         Homo sapiens
Chimpanzee       Primates           Hominidae         Pan troglodytes
Guinea pig       Rodentia           Caviidae          Cavia porcellus
Rat              Rodentia           Muridae           Rattus norvegicus
Mouse            Rodentia           Muridae           Mus musculus
LT-hamster       Rodentia           Cricetidae        Tscherskia triton
Syrian-hamster   Rodentia           Cricetidae        Mesocricetus auratus
Xenopus          Anura              Pipidae           Xenopus laevis
Seabass          Perciformes        Moronidae         Dicentrarchus labrax
Green-spotted    Tetraodontiformes  Tetraodontidae    Tetraodon nigroviridis
Tiger-puffer     Tetraodontiformes  Tetraodontidae    Takifugu rubripes
Cichlid          Perciformes        Cichlidae         Haplochromis burtoni
Grass-carp       Cypriniformes      Cyprinidae        Ctenopharyngodon idella
Goldfish         Cypriniformes      Cyprinidae        Carassius auratus
Rainbow-trout    Salmoniformes      Salmonidae        Oncorhynchus mykiss
Common-lancelet  Amphioxiformes     Branchiostomidae  Branchiostoma lanceolatum
Carp             Cypriniformes      Cyprinidae        Cyprinus carpio
Freshwater eel   Anguilliformes     Anguillidae       Anguilla anguilla
Gorilla          Primates           Hominidae         Gorilla gorilla
Tamarin          Primates           Callitrichidae    Saguinus oedipus
Orangutan        Primates           Hominidae         Pongo pygmaeus

Table 1: Reference table indicating all 26 species used in the homologous sequence analysis, with their classification and the scientific names corresponding to the common names used in phylogenetic tree construction. The species are arranged in order of distribution as determined by bootstrap analysis (Figure 6). Note that all species are members of the phylum Chordata.
References:

Burman, C., and Evans, P. D. (2010). Amphioxus expresses both vertebrate-type and invertebrate-type dopamine D1 receptors. Invert. Neurosci. 10 (2): 93-105. doi: 10.1007/s10158-010-0111-0

Burman, C., Reale, V., Srivastava, D. P., and Evans, P. D. (2009). Identification and characterization of a novel amphioxus dopamine D1-like receptor. J. Neurochem. 111: 26-36. doi: 10.1111/j.1471-4159.2009.06295.x

Candiani, S., Moronti, L., Pennati, R., De Bernardi, F., Benfenati, F., and Pestarino, M. (2010). The synapsin gene family in basal chordates: evolutionary perspectives in metazoans. BMC Evol. Biol. 10: 32.

Cherezov, V., Rosenbaum, D. M., Hanson, M. A., Rasmussen, S. G., Thian, F. S., Kobilka, T. S., Choi, H. J., Kuhn, P., Weis, W. I., Kobilka, B. K., and Stevens, R. C. (2007). High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318: 1258-1265. doi: 10.1126/science.1150577

Missale, C., Nash, S. R., Robinson, S. W., Jaber, M., and Caron, M. G. (1998). Dopamine receptors: from structure to function. Physiol. Rev. 78 (1): 189-225.