Evolutionary Paths of the cAMP-dependent Protein Kinase (PKA) Catalytic Subunits

Kristoffer Søberg 1,2, Tore Jahnsen 2, Torbjørn Rognes 3,4, Bjørn S. Skålhegg 1, Jon K. Laerdahl 4,5 *

1 Department of Nutrition, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
2 Department of Biochemistry, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
3 Department of Informatics, University of Oslo, Oslo, Norway
4 Centre for Molecular Biology and Neuroscience (CMBN), Department of Microbiology, Oslo University Hospital Rikshospitalet, Oslo, Norway
5 Bioinformatics Core Facility, Department of Informatics, University of Oslo, Oslo, Norway

Abstract

3′,5′-cyclic adenosine monophosphate (cAMP)-dependent protein kinase, or protein kinase A (PKA), has served as a prototype for the large family of protein kinases that are crucially important for signal transduction in eukaryotic cells. The PKA catalytic subunits Cα and Cβ, encoded by the two genes PRKACA and PRKACB, respectively, are among the best understood and characterized human kinases. Here we have studied the evolution of this gene family in chordates, arthropods, mollusks and other animals employing probabilistic methods, and show that Cα and Cβ arose by duplication of an ancestral PKA catalytic subunit in a common ancestor of vertebrates. The two genes have subsequently been duplicated in teleost fishes. The evolution of the PRKACG retroposon in simians was also investigated. Although the degree of sequence conservation in the PKA Cα/Cβ kinase family is exceptionally high, a small set of signature residues defining the Cα and Cβ subfamilies was identified. These conserved residues might be important for functions that are unique to the Cα or Cβ clades.
This study also provides a good example of a seemingly simple phylogenetic problem which, due to a very high degree of sequence conservation and correspondingly weak phylogenetic signals, combined with problematic nonphylogenetic signals, is nontrivial for state-of-the-art probabilistic phylogenetic methods.

Citation: Søberg K, Jahnsen T, Rognes T, Skålhegg BS, Laerdahl JK (2013) Evolutionary Paths of the cAMP-dependent Protein Kinase (PKA) Catalytic Subunits. PLoS ONE 8(4): e60935. doi: /journal.pone
Editor: Narayanaswamy Srinivasan, Indian Institute of Science, India
Received December 12, 2012; Accepted March 5, 2013; Published April 12, 2013
Copyright: © 2013 Søberg et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The project was funded by the Research Council of Norway. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: j.k.lardahl@medisin.uio.no

Introduction

Protein kinases are enzymes that catalyze the transfer of a phosphate group from adenosine 5′-triphosphate (ATP) to a serine, threonine, tyrosine or other residue on a substrate. Most eukaryotic protein kinases derive from a common ancestral kinase and share the same core catalytic domain [1]. PKA (EC ) is a serine/threonine kinase that is ubiquitously expressed in the human body. It is involved in many intracellular signaling events, and its function, specificity, and downstream effects depend on factors such as subcellular localization, the expression of a number of isoforms, and physicochemical features [2,3]. The inactive form of PKA is a heterotetrameric holoenzyme consisting of a regulatory (R) subunit dimer bound to two catalytic (C) subunits [4].
During activation, cAMP binds cooperatively to two sites, termed A and B, on each R subunit. In the inactive holoenzyme, only the B site is exposed and available for cAMP binding. Occupation of the B site enhances the binding of cAMP to the A site, which leads to an intramolecular conformational change and the release of the R subunit dimer. The two C monomers are then free to phosphorylate relevant C substrates in the cytosol and nucleus [5–7]. Thus, a major function of the R subunit is to inhibit the phosphotransferase activity of the C subunits through direct interactions.

Several variants of the C and R subunits have been identified in human cells. Four R subunits, designated RIα, RIβ, RIIα, and RIIβ, are transcribed from separate genes [8]. PKA holoenzymes containing RI and RII subunits are designated PKA type I and II, respectively [9,10]. Protein-protein interactions and organization of signal transduction pathways are required to obtain specificity in space and time. Protein kinases are localized to the relevant subcellular sites through anchoring, scaffolding and adapter protein activity. Major organizers of the cAMP signaling pathway are the A kinase anchoring proteins (AKAPs), which both function as scaffolding proteins and attach PKA to subcellular structures [11,12]. Initially, AKAPs were shown to interact with the RII subunits [13]. Later, dual-specific AKAPs binding both RII and RI, as well as AKAPs binding only RI, were identified, demonstrating that both PKA type I and II may be tethered to subcellular compartments in the cell [14–16].

Five different human C subunit genes have been identified: PRKACA, PRKACB, PRKACG, PRKX, and PRKY [2,17–19]. All belong to the AGC group of kinases, which contains the cyclic nucleotide-dependent family (PKA and PKG), the protein kinase C family, the β-adrenergic receptor kinase, the ribosomal S6 family and some other relatives of these kinases [1,20].
Three of the human C subunit genes, PRKACA, PRKACB, and PRKX, have been demonstrated to be transcribed and translated into functional protein kinases, termed PKA Cα, PKA Cβ, and PRKX, respectively. Cα exhibits two splice variants, Cα1 [21] and Cα2 [22–24], by employing two alternative 5′ exons in PRKACA (Fig. 1A). Whereas Cα1 is ubiquitously expressed in man, Cα2 is exclusively expressed in the sperm cell and has been shown to be essential for sperm motility and fertilization [24–27].

PLOS ONE, April 2013, Volume 8, Issue 4, e60935

A number of isoforms of human PRKACB have been identified with alternative splicing of exons 5′ of exon 2 (Fig. 1A), encoding at least the following proteins: Cβ1, Cβ2, Cβ3, Cβ4, Cβ3ab, Cβ3b, Cβ3abc, Cβ4ab, Cβ4b, and Cβ4abc [28–33]. In addition, Cβ variants formed by skipping of exon 4 are expressed in the brain of higher primates. These C subunits bind the R subunit in a cAMP-independent fashion [34]. PRKX, a protein kinase encoded on the X chromosome, also binds RIα in a cAMP-sensitive fashion [19]. PRKX and PRKY are 94% identical and have unknown functions [18,20]. PRKACG is a retroposon lacking introns and may be a pseudogene [35]. Whereas mRNA from PRKACG is transcribed solely, but ubiquitously, in the testis [17], the corresponding protein Cγ has never been identified. In vitro experiments on expressed Cγ have revealed a functional kinase with variant-specific properties. The Cγ protein kinase is, in contrast to Cα, and most probably Cβ, not inhibited by the Protein Kinase Inhibitor (PKI), and it requires higher concentrations of cAMP for dissociation from PKA type I holoenzymes [36,37]. The in vivo function of PKA Cγ, if the protein exists, remains to be elucidated.

The PRKACA and PRKACB genes share identical positions and intron phases for all nine introns, and are very likely the result of a gene duplication event (Fig. 1B). Human PKA Cα1 and Cβ1 have the same length (350 residues) and share 93% sequence identity. Many studies have elucidated the importance and function of a number of residues and short sequence segments in these kinases, and some of these are briefly summarized in Fig. 1C. The sequence identity of human PKA Cα with Cγ and with PRKX is 82% and 54%, respectively, and the identity between Cα/Cβ and other human kinases is below 50%.
PKA C kinases in invertebrate metazoa have sequence identity with human Cα/Cβ above 77% (see below), confirming that the PKA Cα/Cβ-like kinases, not including PRKX, form a unique and compact clade/family of kinases within the AGC group.

A previous study explored the evolution of the R subunits of PKA [38]. Based on a multiple sequence alignment (MSA) of the most conserved region in the R subunit sequences, the phosphate-binding cassette, the authors proposed a new classification of the R subunits in place of a classification based on physicochemical properties. They also identified a signature sequence that characterizes the R subunit genes, and identified type- and subtype-specific residues. A comparably focused analysis of the C subunits of PKA is, to our knowledge, lacking. In order to get a comprehensive overview of all the known PKA genes and obtain insight into the essential residues of these proteins, we have collected, compared and performed an extensive analysis of PKA C subunit sequences from a large number of bilaterian animal species. The homologous genes PRKACA and PRKACB constitute a unique clade of kinases, and their protein products are the main sources of PKA activity in the cell. Therefore, we focused our analysis on the Cα/Cβ-like kinases, including Cγ, but not the more remotely related kinases PRKX and PRKY. The main focus of this study has been on elucidating the phylogeny of vertebrate and other chordate PKA Cα/Cβ homologs, while orthologous sequences from mollusks and arthropods were included mainly to serve as outgroups in the phylogenetic analysis. We show that an ancestral C subunit gene was duplicated around the time of the evolution of the first vertebrate species, giving rise to the paralogous genes encoding Cα and Cβ. Further analysis of the Cα/Cβ homologs revealed signature sequences characteristic of the two paralogs.
Comparison of Cα and Cβ sequences gave insight into molecular differences and possible mechanisms that may determine functional differences between these two prototype protein kinases.

Materials and Methods

Sequences of PKA Cα/Cβ Homologs

Homologous sequences of human PKA Cα/Cβ were obtained from the UniProt [39] and NCBI [40] database resources and from the Ensembl project [41] as described in Materials and Methods S1. Standard BLAST sequence searching algorithms were employed [42], and only sequences with a protein-level sequence identity to human PKA Cα or Cβ above 75% were included in the dataset for further analysis. The final dataset comprised 41 sequences from placental mammals, including 18 from primates, 5 marsupial sequences, 35 sequences from nonmammalian vertebrates, and 15 from invertebrates (see Materials and Methods S1 and Table S1). For each of the 96 sequences in total, both the nucleotide and the protein sequence of one single splice variant per gene were stored in FASTA format. Of these, 81 sequences appear to be full-length, comprising exons 2 to 10 and in addition a 5′ exon chosen to correspond to human Cα1/Cβ1 when possible. Of the incomplete sequences, 8 were missing only the 5′ exon. In addition, one Petromyzon marinus sequence was missing exons 1, 7, 9, and 10, and the two sequences from Macropus eugenii were missing exons 1 and 9, and part of the 3′ end, respectively. Finally, the four cartilaginous fish sequences only contain fragments of the full-length sequence.

The sequences were aligned with MUSCLE [43], and the MSAs were viewed and edited with Jalview [44] (see Figure S1). All analysis was subsequently based on MSAs corresponding to exons 2–10 of human PKA Cα1/Cβ1 (i.e. residues , Fig. 1B) unless otherwise stated, and columns of the MSAs containing gaps were deleted. PHYLIP and NEXUS format files were generated from the FASTA files with a dedicated Perl script.
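The two preprocessing steps described above, deletion of alignment columns that contain gaps and conversion to PHYLIP format, can be illustrated with a short sketch. The Perl script used in the study is not reproduced in the text, so the Python code below is only an illustrative stand-in: the function names, the toy sequences and their labels are hypothetical, and the output uses the relaxed sequential PHYLIP layout (label, whitespace, sequence), which is assumed here to be acceptable to the downstream programs.

```python
def strip_gap_columns(alignment):
    """Drop every alignment column in which at least one sequence has a gap.

    `alignment` maps a taxon label to its aligned sequence; all sequences
    must have the same length.
    """
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    n_columns = lengths.pop()
    # Keep a column only if no sequence carries a gap character there.
    keep = [
        i for i in range(n_columns)
        if all(seq[i] not in "-." for seq in alignment.values())
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def to_relaxed_phylip(alignment):
    """Serialize an alignment as sequential, relaxed PHYLIP text."""
    n_taxa = len(alignment)
    n_sites = len(next(iter(alignment.values()))) if alignment else 0
    lines = [f"{n_taxa} {n_sites}"]
    for name, seq in alignment.items():
        if any(ch.isspace() for ch in name):
            raise ValueError("taxon labels must not contain whitespace")
        lines.append(f"{name} {seq}")
    return "\n".join(lines) + "\n"


# Toy nucleotide alignment with hypothetical labels (not sequences from the study).
toy = {"human_Ca": "ATG-CC", "human_Cb": "ATGACT", "fly_C": "A-GACC"}
ungapped = strip_gap_columns(toy)
phylip_text = to_relaxed_phylip(ungapped)
print(phylip_text)
```

Removing whole columns rather than individual gap characters keeps all sequences the same length, which alignment-based tree inference requires.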
The best model for nucleotide evolution was determined with ModelTest 3.7 [45] in combination with PAUP* (D. L. Swofford, PAUP*, Sinauer Associates, Sunderland, MA), using the Akaike Information Criterion (AIC). The evolutionary model selected was the general time reversible (GTR) model with a discrete gamma distribution (Γ) for modeling rate heterogeneity over sites and a proportion of invariant sites (I), i.e. GTR+Γ+I with four rate categories. The best fitting sequence substitution model for the protein data was determined with ProtTest 2.4 [46] and was found to be, according to the AIC, LG+Γ+I [47]. The second best model, JTT+Γ+I [48], was also tested for phylogenetic tree construction.

Phylogenetic Analysis

Bayesian inference of phylogeny was carried out with MrBayes [49,50] with default priors and heating parameters (three heated Markov chain Monte Carlo chains and one cold). Two simultaneous and independent runs were carried out with sampling every 10 of 500 k generations until the average standard deviation of split frequencies was below . Branch lengths and majority rule consensus tree topologies were calculated after discarding a burn-in of 100 k generations, after which stationarity had been reached. For all calculations presented, the final potential scale reduction factor (PSRF) was below for all parameters. Phylogenetic trees were also generated with the Maximum Likelihood (ML) method employing PhyML 3.0 [51] with default parameters. The robustness of each clade was estimated by a nonparametric bootstrap analysis with 1000 replicates. Gamma distribution parameters, the proportion of invariable sites, branch lengths and GTR model parameters were all optimized from the data by the ML algorithm. ML phylogenetic trees were also inferred using RAxML [52], but because the resulting trees were similar, only the PhyML results are shown here. Dendroscope [53] was used for visualization of phylogenetic trees. Signature sequence logos were generated with WebLogo [54]. Calculations were carried out on the University of Oslo Titan computer cluster, mainly through the freely available Bioportal ( ). Protein structure illustrations were generated with PyMOL (W. L. DeLano, The PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC). Ratios of nonsynonymous and synonymous substitution rates were calculated with the KaKs_Calculator [55].

Figure 1. Gene structure and overview of the human PKA catalytic subunits Cα and Cβ encoded by the PRKACA and PRKACB genes, respectively. (A) Exons, introns and 3′ and 5′ untranslated regions (UTRs) of the PKA catalytic subunit genes are shown. The human Cα gene (PRKACA) is located at chromosome 19p13.1 (reverse strand) and has a length of approximately nucleotides (nt). Alternative transcription start sites give rise to two splice variants known as Cα1 and Cα2 (formerly known as CαS). Both splice variants comprise exons 2 to 10 and in addition a 5′ exon, 1-1 or 1-2 in Cα1 and Cα2, respectively. The human Cβ gene (PRKACB) is located at chromosome 1p31.1 (forward strand), with a length of approximately nt. Alternative splicing of exons 1-1, 1-2, 1-3 and 1-4 gives rise to the splice variants Cβ1, Cβ2, Cβ3 and Cβ4, respectively. In addition, three short exons, a, b and c, have been shown to be included in the transcript in various combinations. All known enzymatically active isoforms of Cβ comprise exons 2 to 10. (B) Human PKA Cα1 consists of ten α-helix and nine β-strand secondary structure elements [76]. The figure (middle box) gives the location of α-helices (pink, A–J) and β-strands (yellow, 1–9) relative to the ten exons and 351 encoded codons of the Cα1 isoform. The locations of the boundaries between exons are given on the upper line. The codons corresponding to the nine introns, as well as their intron phases, are given on the lower line. The intron phase is defined as the position of the intron within a codon, with phase 0, 1, or 2 lying before the first base, after the first base, or after the second base, respectively. Human PKA Cβ1 has the same length as Cα1, and the two proteins differ at only 25 amino acid positions (92.9% sequence identity), strongly suggesting that the overall 3D structures, including secondary structure elements, are close to identical. The positions and intron phases of the nine introns are also conserved between human Cα1 and Cβ1. The sequence segment corresponding to exons 2–10, termed Core, is shown as a blue bar. (C) The function of selected important residues and motifs in human PKA catalytic subunits has previously been elucidated in the literature. All listed residues, and their numbering, are identical in Cα1 and Cβ1, but the research describing these residues and motifs has mainly been performed on Cα. Numbering of amino acids is given for mature Cα1 and Cβ1 (with the N-terminal Met removed), both comprising 350 residues. Gly1 is posttranslationally modified by myristoylation [77]. Ser10, Ser139 and Ser338 are well characterized phosphorylation sites [78,79]. The Gly-rich loop (Gly50–Gly55) plays an important role in phosphoryl transfer [80–82]. Lys72 and Asp184 are crucial for ATP and Mg2+ binding in the active site [81], and the DFG motif (Asp184–Gly186) is conserved in most kinases. The conformation of the motif is critical for the functional state of the kinase [83,84]. Phe327 is the only residue outside of the kinase core that binds to the adenine of ATP [82]. Trp196 is an essential residue for R subunit binding, and phosphorylation of Thr197 is necessary for the enzyme to assume the active conformation, thereby facilitating catalysis as well as R subunit binding [5,85]. The hydrophobic P+1 motif (Gly200–Glu208) is important for the structure of the enzyme, as well as for substrate recognition [80,86]. Tyr247 competes with cAMP for R subunit binding [5]. doi: /journal.pone g001

Results and Discussion

The PKA Cα/Cβ Family is Highly Conserved in Chordates

Vertebrate homologs of human PKA Cα and Cβ were extracted from public databases, including Ensembl, UniProt, and database resources provided by the NCBI. New gene models were generated for several of the homologs by careful manual curation (see details in Materials and Methods S1). Homologs, in most cases full-length sequences, were found in 21 placental mammals, the marsupial species opossum (Monodelphis domestica) and wallaby (Macropus eugenii), chicken, zebra finch (Taeniopygia guttata), the Carolina anole lizard (Anolis carolinensis), two species of frogs and six species of bony fishes.
In addition, full-length sequences were obtained for the non-vertebrate chordates amphioxus (Branchiostoma floridae) and the tunicates Ciona intestinalis and Ciona savignyi, the echinoderm sea urchin (Strongylocentrotus purpuratus), the two mollusks great pond snail (Lymnaea stagnalis) and California sea hare (Aplysia californica), the hemichordate Saccoglossus kowalevskii, the nematode Caenorhabditis elegans, the sponge Amphimedon queenslandica, as well as six arthropods including honey bee, fruit fly, a tick, and a crustacean. Partial sequences were obtained for orthologs from dogfish shark (Squalus acanthias), little skate (Leucoraja erinacea) and sea lamprey (Petromyzon marinus). In all cases both the nucleotide sequences and the corresponding protein sequences were stored. All sequences are listed in Materials and Methods S1.

We were unable to detect more than a single, reasonably close, PKA Cα/Cβ homolog in any of the invertebrate species that were examined. These include the cephalochordate amphioxus, the urochordates C. intestinalis, C. savignyi and Oikopleura dioica, the echinoderm sea urchin, as well as arthropods, mollusks, a nematode and a sponge. Two homologs were found in most vertebrates, including sea lamprey and the chondrichthyes S. acanthias and L. erinacea, while four homologs were detected in a number of bony fishes.

The main isoforms of human PRKACA and PRKACB comprise exons 2–10 and in addition one or more 5′ exons (Fig. 1A). The lengths of all exons 2–10 and the positions and intron phases of all nine introns are identical in the two human genes (Fig. 1B), clearly demonstrating that these genes arose by gene duplication. This structure, reflecting an extreme degree of conservation, is also conserved in all chordate homologs, including the invertebrate cephalochordate amphioxus and the urochordate tunicates, and in addition in the echinoderm sea urchin, the hemichordate S. kowalevskii, and the crustacean Daphnia pulex: the number of exons, their lengths and the codon phases of all introns are identical. The exceptions, apart from the intronless homologs described below, include the medaka gene encoding a protein with Ensembl identifier ENSORLP . This gene appears to have acquired a new 88-nucleotide intron (intron phase 0) in the middle of exon 9. In addition, the basal metazoan A. queenslandica has an additional intron of 334 nucleotides that splits the coding sequence corresponding to vertebrate exon 6. The number of codons encoded by exons 2–10 in the PRKACA/B family of genes, corresponding to residues in human proteins PKA Cα1/Cβ1, is consequently also identical in all species listed above. We term this sequence segment Core (Fig. 1B) in order to distinguish it from the variable 5′ exons. The genomic data for the mollusks are not yet available in the public domain and the intron/exon structure is currently unknown, but in insects the protein coding segment of the PRKACA/B homologs is contained in a single exon. Nevertheless, the insect homologs also have the same number of residues for the segment corresponding to Core. An MSA of all sequences with full-length Core is shown in Figure S1.

Consequently, we find that nearly all our PKA Cα/Cβ homologs from Bilateria have the same length for the Core, with variable-length N-termini corresponding to the multitude of isoforms. The only exceptions are a single-residue insertion after human PKA Cα1 Lys63, in the middle of exon 3, in both homologs from mollusks and a single-residue insertion after human PKA Cα1 Ala38 in a fast-evolving sequence from opossum (identifier ENSMODP ). These insertions are in loop structures in PKA Cα1 (see e.g. [56] for the 3D structure) and are expected to be compatible with an unchanged overall 3D structure of the kinase.
Pairwise sequence identity between any of the bilaterian PKA Cα/Cβ homologs described above is always above 77% for the 335 residues of the Core at the amino acid level. Leaving out the fast-evolving sequences from marsupials and a single fast-evolving zebrafish sequence (Q7T374), the sequence identities are always above 80% and 87% within the chordates and vertebrates, respectively. This demonstrates very strong purifying selection and an exceptionally high degree of sequence conservation for this protein family.
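As an illustration of how such pairwise identity figures can be obtained from an ungapped alignment, the sketch below computes the percent identity for every pair of aligned sequences and reports the lowest value. It is a minimal example under stated assumptions, not the procedure used in the study: the function names and the three toy peptide sequences are hypothetical, and positions where either sequence has a gap are simply skipped.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent of aligned, non-gap positions at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


def min_pairwise_identity(alignment):
    """Lowest percent identity over all pairs of sequences in the alignment."""
    return min(
        percent_identity(alignment[p], alignment[q])
        for p, q in combinations(alignment, 2)
    )


# Toy aligned peptides with hypothetical labels (not sequences from the study).
toy = {
    "seq_a": "ACDEFGHIKL",
    "seq_b": "ACDEFGHIKV",
    "seq_c": "ACDEFGHIRV",
}
lowest = min_pairwise_identity(toy)

# Arithmetic check of a figure quoted for human PKA Calpha1 and Cbeta1:
# 25 differing positions out of 350 residues.
human_identity = round(100 * (350 - 25) / 350, 1)
print(lowest, human_identity)
```

The final lines reproduce the 92.9% identity stated for human Cα1 and Cβ1, which differ at 25 of 350 positions.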

The PKA Cα/Cβ Gene Family Contains Several Putative Retroposons

Besides the PRKACA/B homologs with conserved exon/intron structure in chordates and several other deuterostomes, and the arthropod homologs, which also have several exons but contain the protein coding segment in a single exon, a number of intronless PRKACA/B homologs are found in vertebrate genomes. Among these is human PRKACG, located on chromosome 9 between the genes PIP5K1B and FXN, but transcribed in the opposite direction. Intronless PRKACG has the same number of codons as the PKA Cα1-like transcript of PRKACA and is conserved in the great apes chimpanzee, gorilla, and orangutan and in the Old World monkeys rhesus macaque and hamadryas baboon (sequences are listed in Materials and Methods S1). Genome browsing at the Ensembl resource also shows the synteny to be conserved in these species, with PIP5K1B and FXN being transcribed in one direction and PRKACG, located between these two genes, in the opposite direction. Between PIP5K1B and FXN in the genome of the gibbon (Nomascus leucogenys), a species more closely related to great apes than the Old World monkeys are, there is no full-length PRKACG ortholog, but instead a putative PRKACA/B pseudogene with several frame-shifting mutations. We find no evidence of PRKACG orthologs in prosimian primates such as greater galago (Otolemur garnettii), tarsier (Tarsius syrichta), or gray mouse lemur (Microcebus murinus), although the last two of these have genomes that are still fragmented and with undisclosed synteny around the genes PIP5K1B and FXN. In the common marmoset (Callithrix jacchus) genome, there is a fragment of PRKACG, most likely not protein-coding, between PIP5K1B and FXN, corresponding to PKA Cα1 residues . Finally, in the mouse and dog genomes, PIP5K1B and FXN are neighboring genes transcribed in the same direction, but without any sign of a PRKACA/B homolog in this region.
These findings strongly support the previous suggestion that PRKACG is a retroposon derived from a PKA Cα1-type transcript [35] that was inserted between PIP5K1B and FXN in a common ancestor of great apes and Old and New World monkeys. The putative PRKACG transcripts could potentially give rise to functional kinases in all great apes and in the Old World monkeys, but not in gibbons and the New World monkey marmoset, where there are mutations disrupting the reading frame in the PRKACG retroposon. In the marmoset genome there is, in addition to the putative PRKACG pseudogene on chromosome 1, a second retroposon (Ensembl identifier ENSCJAP ) related to PKA Cα1 on chromosome 2. This gene was not found in other primates.

In addition to the PKA Cα1-like retroposons in primates, we found intronless PRKACA homologs in the two sequenced genomes of marsupials, the wallaby (M. eugenii) and the Brazilian opossum (M. domestica) (see Materials and Methods S1). These putative retroposons also appear to be derived from a PKA Cα1-type transcript, but are otherwise unrelated to primate PRKACG.

Elucidating the Phylogeny of the PKA Cα/Cβ Family is Nontrivial Due to High Sequence Conservation

The distribution of PKA Cα/Cβ homologs in Bilateria, as well as phylogenetic trees generated with unsophisticated hierarchical clustering methods (results not shown), suggest that this gene family has expanded through repeated gene duplication events in vertebrates. However, despite numerous attempts, state-of-the-art probabilistic methods, both Bayesian inference and maximum likelihood (ML) methods, were not able to generate a statistically strongly supported phylogenetic tree for the full PKA Cα/Cβ family from the complete data set. After careful analysis of the data (vide infra), we were nevertheless able to derive reliable phylogenies for this gene family by dividing the data into subsets.
The final phylogenetic trees were generated as follows. An MSA was generated from the nucleotide sequences corresponding to the Core segment for selected vertebrates, amphioxus, sea urchin and fruit fly. After removal of all nucleotides at codon position 3, the dataset was employed to generate Bayesian inference and ML trees for the PKA Cα/Cβ homologs. The two trees had identical topology, with good Bayesian posterior probabilities and bootstrap support for the major nodes. The phylogram was rooted with the sea urchin and fruit fly as outgroups (Fig. 2). Bayesian inference methods were also used to generate a phylogenetic tree of 22 vertebrate PKA Cα orthologs employing human and mouse PKA Cβ as outgroups (Fig. 3A). Similarly, a tree with 26 PKA Cβ orthologs was generated with human and mouse PKA Cα as outgroups (Fig. 3B). These trees, and trees from the corresponding ML analysis, are based on MSAs for the nucleotide sequences with all three codon positions included.

Neither Bayesian inference nor ML methods, generally accepted to be the most accurate [57,58], were able to generate a reliable phylogeny for the PKA Cα/Cβ family from the full data set. This was not due to unreliable MSAs, as only three of the sequences in the original data set each had single codon insertions (vide supra) and manual removal of these was trivial. The problem, however, appears to be a combination of weak phylogenetic signals and problematic nonphylogenetic signals in the data. Due to the very high level of sequence conservation, the protein data set contains few phylogenetically informative sites, i.e. amino acid sites that favor one phylogenetic tree topology over others. As an example, for the 22 chordate taxa used to generate the tree in Fig. 2, the corresponding amino acid data set has 335 columns/sites. Of these, only 58 sites are phylogenetically informative, while 255 are fully conserved and identical for all taxa and the remaining 22 are autapomorphic.
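The three site categories used in this count can be made concrete with a short sketch. It assumes the standard parsimony criterion, under which a site is informative only if at least two character states each occur in at least two taxa. The text does not state which criterion was applied, so this is an assumption, and the function name and toy alignment are hypothetical.

```python
from collections import Counter


def classify_sites(sequences):
    """Count conserved, informative and autapomorphic alignment columns.

    Assumed definitions (standard parsimony criterion):
      conserved      one state shared by all taxa
      informative    at least two states, each present in at least two taxa
      autapomorphic  variable, but not informative (e.g. one taxon differs)
    """
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    totals = Counter()
    for column in zip(*sequences):
        state_counts = Counter(column)
        shared_states = sum(1 for n in state_counts.values() if n >= 2)
        if len(state_counts) == 1:
            totals["conserved"] += 1
        elif shared_states >= 2:
            totals["informative"] += 1
        else:
            totals["autapomorphic"] += 1
    return totals


# Toy four-taxon alignment (hypothetical, not data from the study).
toy = ["AAAA", "AACA", "ACCT", "ACCA"]
toy_totals = classify_sites(toy)

# The counts reported for the 22-taxon protein data set add up to all sites.
reported_total = 58 + 255 + 22
print(dict(toy_totals), reported_total)
```

In the toy alignment the second column (A, A, C, C) is the only informative one, since it alone splits the taxa into two groups of at least two. The reported counts of 58, 255 and 22 sites sum to the 335 columns of the Core alignment.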
Unsurprisingly, protein data phylogenetic trees generated with two recommended substitution models (LG/JTT+Γ+I) and well-tested ML programs (PhyML/RAxML) were overall fairly similar, but had very poor bootstrap support. The trees generated with various subsets of the available taxa were incongruent and in several cases also inconsistent with the known evolutionary relationships between the various chordate species.

In order to secure a stronger phylogenetic signal [59], the nucleotide data set was employed for deriving the phylogenetic relations. For the 22 chordate taxa in Fig. 2, there are 77, 33, and 312 phylogenetically informative sites (out of 335 sites in total) for codon positions 1, 2, and 3, respectively. The low number of informative sites at codon positions 1 and 2 reflects the high degree of conservation at the amino acid level, while the high number at codon position 3 (93%) reflects the large time span since the common ancestor of these genes. For all tested nucleotide data sets, the GTR+Γ+I model was predicted to be superior.

While using the nucleotide sequences ensures a stronger phylogenetic signal, the data from codon position 3 in particular may, due to the degeneracy of the genetic code, be severely mutationally saturated, with reversions and convergences erasing the true phylogenetic signal. This is especially problematic for inferring ancient phylogenies [59], where long branch attraction (LBA), i.e. a tendency for lineages with long branches to group together irrespective of their true relationships, may lead to misleading phylogenies [60]. In particular, fast-evolving genes may artificially occur too deeply in the tree due to LBA towards the outgroups.
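Partitioning a coding alignment by codon position, and discarding position 3, amounts to simple string slicing once the sequences are in frame. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and it assumes an ungapped, in-frame coding sequence whose length is a multiple of three.

```python
def codon_position_sites(coding_seq, position):
    """Return the nucleotides found at codon position 1, 2 or 3."""
    if position not in (1, 2, 3):
        raise ValueError("codon position must be 1, 2 or 3")
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return coding_seq[position - 1::3]


def drop_third_positions(coding_seq):
    """Keep codon positions 1 and 2 only, preserving their order."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(coding_seq[i:i + 2] for i in range(0, len(coding_seq), 3))


# Toy in-frame sequence of three codons: ATG GCC AAA (hypothetical).
toy = "ATGGCCAAA"
first = codon_position_sites(toy, 1)
third = codon_position_sites(toy, 3)
reduced = drop_third_positions(toy)

# Share of informative sites per codon position quoted for the 22 taxa
# (77, 33 and 312 informative sites out of 335 sites at each position).
shares = [round(100 * n / 335) for n in (77, 33, 312)]
print(first, third, reduced, shares)
```

The last line recovers the 93% figure quoted for codon position 3 and gives roughly 23% and 10% for positions 1 and 2, which illustrates how much of the variation sits at the position most prone to saturation.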

6 Figure 2. Phylogenetic relationships among the PKA catalytic subunit homologs in chordates. The Ca and Cb paralogs are a result of a gene duplication in a common ancestor of vertebrates. Subsequent duplications of Ca and Cb in a teleost fish ancestor have resulted in four PKA catalytic subunits in these organisms. The Bayesian inference tree is based on the nucleotide sequences (codon positions 1 and 2 only, GTR+C+I model) of exons 2 to 10 which corresponds to a multiple sequence alignment with no gaps. The phylogram is shown with estimated branch lengths proportional to the number of substitutions at each site, as indicated by the scale bar. The arthropod fruit fly (D. melanogaster) and the echinoderm sea urchin (S. purpuratus) have been set as outgroups. Bayesian posterior probabilities are shown for each node. The topology of a maximum likelihood (ML) tree generated with the same data set and model was identical to the Bayesian inference tree. ML bootstrap values are shown for selected nodes (1000 replications). The sequences of human and mouse PKA Ca and Cb and the homologs from amphioxus (B. floridae), zebra finch (T. guttata), chicken (G. gallus), the frog X. tropicalis, the lizard A. carolinensis, medaka (O. latipes), the pufferfish T. rubripes, and stickleback (G. aculeatus) are described in Materials and Methods S1. The X. tropicalis Ca and A. carolinensis Cb are incorrectly placed (See discussion and Fig. 3). doi: /journal.pone g002 As for the protein data set, trees generated with subsets of the available taxa were highly incongruent and in several cases with pronounced LBA, clearly demonstrating a strong nonphylogenetic signal. In order to secure minimal nonphylogenetic signals and reduce LBA, codon position 3 sites were removed from the data set used to generate the phylogeny in Fig. 2. In addition, fast-evolving taxa such as the tunicate [61], marsupial PKA Ca/Cb homologs and the primate PKA Cc homologs were left out of the data set. Fig. 
2 is expected to describe the ancient evolution of the PKA Cα/Cβ family in chordates correctly, with the gene duplications in the common ancestor of vertebrates as well as in teleost fishes supported by high Bayesian posterior probabilities and strong bootstrap support for the ML analysis. In order to better describe the evolution of the PKA Cα and PKA Cβ paralogs in vertebrates separately, the trees in Fig. 3 were generated. Reliable phylogenies could not be generated from a data set containing codon positions 1 and 2 only, as in Fig. 2, most likely due to weak phylogenetic signals in the data. However, for the relatively recent phylogenetic relations in Fig. 3, the saturation at codon position 3 is expected to be less severe, and all three codon positions were included in the data set for analysis.

In Fig. 2, there are two errors due to nonphylogenetic signal: the placement of frog PKA Cα in a clade together with teleost fish, and the erroneous lizard/frog PKA Cβ clade. Both errors are corrected in Fig. 3 upon inclusion of codon position 3 data. The most ancient splits, however, are unreliable in Fig. 3, in particular the branching between the tetrapod and teleost fish PKA Cα homologs, which is described as a multifurcation (Bayesian analysis) or as two bipartitions with extremely poor bootstrap support (ML analysis, not shown) in Fig. 3A. In conclusion, while the phylogenies of the PKA Cα and PKA Cβ subfamilies, especially within the tetrapods, appear to be correctly inferred from the full nucleotide data set (Fig. 3), the ancient evolution of the PKA Cα/Cβ family in chordates is correctly described with cladistic methods only after removal of codon position 3 data (Fig. 2). The protein data set contains limited phylogenetic information and does not give reliable phylogenies in chordates. No analysis based on partitioned data, for example according to codon position, was attempted, as this is likely to lead to overparametrization.

Figure 3. The Bayesian inference trees for vertebrate PKA Cα and Cβ both closely reflect the evolutionary relationships among these organisms. A Phylogenetic analysis of Cα orthologs resulted in a tree that was rooted with human and mouse Cβ as outgroups. The tree was based on the nucleotide sequences of exons 2 to 8 (all codon positions, GTR+Γ+I model). B Phylogenetic analysis of Cβ orthologs was performed employing nucleotide sequence data (all codon positions, exons 2 to 10, GTR+Γ+I model). The resulting tree was rooted with human and mouse Cα as outgroups. In both trees, branch lengths are shown as substitutions per site, with scale indicated by the scale bars. Bayesian posterior probabilities are given for each node and ML bootstrap values (1000 replications) are shown for selected nodes where the clades are identical in the Bayesian and ML analyses. In addition to the organisms found in Fig. 2, representative sequences from the following species were included: the eutherian mammals rhesus macaque (M. mulatta), tarsier (T. syrichta), dog (C. familiaris), horse (E. caballus), pig (S. scrofa), cow (B. taurus), rat (R. norvegicus), and hamster (C. griseus), the marsupial mammals wallaby (M. eugenii) and opossum (M. domestica), the frog X. laevis, the pufferfish T. nigroviridis and Atlantic salmon (S. salar). See Materials and Methods S1 for the sequence data. doi: /journal.pone g003

Phylogenetic Inferences for the PKA Cα/Cβ Family in the Chordate Lineage

The well-resolved phylogenetic tree in Fig. 2 strongly suggests the following sequence of events during the evolution of the PKA Cα/Cβ gene family: a single PKA Cα/Cβ-like gene in the common ancestor of chordates, arthropods and echinoderms was duplicated in a common ancestor of vertebrates, which led to the two paralogous genes corresponding to PKA Cα and Cβ. These two genes were again duplicated in a common ancestor of teleost fishes, leading to paralogs that we suggest be denoted PKA Cα-I, Cα-II, Cβ-I, and Cβ-II. The data indicate that the first PKA Cα/Cβ gene duplication, resulting in PKA Cα and PKA Cβ, took place after the divergence of the urochordate and cephalochordate lineages.
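The data reduction used for Fig. 2, in which codon position 3 sites are removed before tree inference, can be sketched in a few lines of Python. This is a minimal illustration assuming gap-free, in-frame coding sequences; the function names and the example sequences are hypothetical and do not reproduce the study's actual pipeline.

```python
def drop_third_codon_positions(sequence):
    """Return the sequence with every third nucleotide (codon position 3) removed."""
    if len(sequence) % 3:
        raise ValueError("sequence length must be a multiple of three")
    # Keep positions 1 and 2 of each codon, skip position 3.
    return "".join(sequence[i:i + 2] for i in range(0, len(sequence), 3))


def reduce_alignment(alignment):
    """Apply the reduction to every sequence in a {name: sequence} alignment."""
    return {name: drop_third_codon_positions(seq) for name, seq in alignment.items()}


# Hypothetical two-taxon alignment that differs only at third codon positions.
example = {"taxon_A": "ATGGCTAAA", "taxon_B": "ATGGCCAAG"}
print(reduce_alignment(example))
```

In this toy case the two reduced sequences become identical, which illustrates the trade-off described in the text: removing codon position 3 discards saturated sites, but also most of the variable sites.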
Currently there are only fragments of PKA Cα/Cβ homologs available in public databases for the Chondrichthyes (dogfish shark and little skate) and the cyclostome sea lamprey (Table S1), but the PKA Cα/Cβ homologs appear to occur in pairs in these organisms as well. Unfortunately, the phylogenetic signal in the data is too weak to classify these sequences as PKA Cα or PKA Cβ, or to exclude the possibility that these paralogous gene pairs are the results of independent gene duplications. The most parsimonious explanation for this distribution of PKA Cα/Cβ homologs is nevertheless that a single PKA Cα/Cβ gene duplication occurred before the divergence of the jawless fish lineage and the subsequent divergence of sharks and skates. Interestingly, this timing of the PKA Cα/Cβ gene duplication coincides with the two rounds of whole genome duplication (2R hypothesis) that took place after the emergence of the invertebrate chordates and before the radiation of jawed vertebrates [62–64]. The PKA Cα and PKA Cβ gene split is thus likely to have occurred in the Cambrian, roughly 500 Mya [65,66]. Canaves and Taylor [38] found that the gene duplications of the PKA regulatory subunits resulting in the paralogs RIα and RIβ as well as RIIα and RIIβ also occurred in the chordate lineage, suggesting that the gene duplications of the R and C subunits might have taken place simultaneously. The secondary duplications of PKA Cα and PKA Cβ (Figs. 2 and 3) appear to be unique to teleost fishes, and might have coincided with the teleost whole genome duplication that took place […] Mya [67,68]. Finally, the presence of two X. laevis PKA Cα paralogs and a single X. tropicalis PKA Cα (Fig. 3A) in the amphibian genomes is consistent with the recent whole genome duplication event in the common ancestor of the X. laevis group not found in X. tropicalis [69,70].

As expected, both the subtrees for PKA Cα (Fig. 3A) and PKA Cβ (Fig.
3B) have eutherian clades with the marsupial orthologs as sister clades, and with Sauropsida (birds and reptiles, Fig. 3B only) and frogs appearing as sister clades deeper in the phylogenetic tree. We were unable to find the PKA Cα gene in any of the released genomes of Sauropsida, and this clade is consequently missing in Fig. 3A. However, a single EST (expressed sequence tag) sequence from a chicken testis library (GenBank identifier CN [71]) appears to confirm the presence of PKA Cα in birds as well. Several vertebrate species are missing either PKA Cα or PKA Cβ in the current genomic data sets, but this is most likely due to low sequence coverage in unfinished genomes. We find no evidence for extensive PKA Cα or Cβ gene loss in any of the main vertebrate groups.

The marsupial M. domestica PKA Cβ appears to be particularly fast-evolving (Fig. 3B). The M. eugenii PKA Cβ ortholog is present in the genome, but the full sequence is currently unknown. Both marsupial genomes [72,73] have two PKA Cα paralogs. Marsupial PKA Cα-I has the same exon/intron structure as PKA Cα in all other mammals. Marsupial PKA Cα-II, however, is intronless and a putative PKA Cα retroposon (Fig. 3A). The full-length sequences are not at present available for all the marsupial PKA Cα homologs, but interestingly, the intronless PKA Cα-II appears to be significantly more conserved than the ancestral variant PKA Cα-I. The 5′ segments of the two marsupial PKA Cα-II orthologs (only the […] codons are available in the sequence for M. eugenii) have 17 synonymous and no non-synonymous mutations, suggesting strong purifying selection. For PKA Cα-I, the 294 codons corresponding to exons 2–9 (exons 1 and 10 are missing in the current M. eugenii genomic sequence) have 70 mutations, including a 3-nucleotide insertion in opossum PKA Cα-I, and 21 of these are non-synonymous.
These data, although limited, strongly suggest that the retroposon PKA Cα-II has become functional in marsupials and that the purifying selection acting upon PKA Cα-I and PKA Cβ has become less stringent.

Vertebrate PKA Cα and Cβ Mainly Differ in the C-tail and in Subdomains I and II

In order to elucidate the potential differences between the PKA Cα and PKA Cβ protein subfamilies, an MSA of all available vertebrate homologs was generated. In this set of 27 PKA Cα and 33 PKA Cβ sequences, only 62 of the 335 sites/columns of the Core (19%) are phylogenetically informative, while 235 sites are fully conserved in all taxa, again reflecting the very high degree of purifying selection. A manual inspection of the MSA showed that eleven sites/columns could tentatively be used to discriminate between the two PKA subfamilies. Sequence logos for these eleven sites are shown in Fig. 4. Human PKA Cα1 residues Gln35, Thr37, Glu64, Gly66, His68, Ser109, and Glu334 are fully conserved in all vertebrate PKA Cα, while at the corresponding sites in PKA Cβ the sequence conservation is less stringent. Similarly, human PKA Cβ1 residues Asp42, Gln67, and Arg319 are fully conserved in vertebrate PKA Cβ (Fig. 4). The single site where there is no overlap in amino acid use between the two subfamilies is residue 66, where PKA Cα and Cβ have Gly and Glu/Asn/Asp, respectively.

The residues at the eleven sites that tentatively discriminate between the subfamilies PKA Cα and Cβ are all exposed at the protein surface, in or close to loop structures, in subdomains I and II and in the C-tail (Fig. 5). None of these residues are located close to the kinase active site, and their identity is not likely to affect PKA kinase activity. Likewise, they are not located near the protein surface segments that are known to interact with the PKA regulatory subunits (Fig. 5B), and these residues should not be important for R subunit interactions. Several of the eleven residues, especially residues 64, 319, 334 and 340, project their side chains into the solvent and might be targets for PKA Cα- or PKA Cβ-specific post-translational modifications and/or protein-protein interactions. In particular, Glu64 is conserved in all vertebrate PKA Cα, while residue 64 is variable in the 33 Cβ sequences, being Ala in all mammals and Glu in only three fish homologs. In PKA Cβ, Glu66 is absolutely conserved, except in five of the fish homologs, while this residue is conserved as Gly in Cα.

Figure 4. The identity of eleven amino acids in the protein chain may define the Cα and Cβ branches of PKA catalytic subunits. Our full set of PKA catalytic subunits (Materials and Methods S1) from bony fishes and tetrapods, comprising 27 Cα and 33 Cβ, was employed to identify eleven amino acid positions that together may be used to classify a PKA catalytic subunit as belonging to one of the two branches. The sequence logos define the PKA Cα and Cβ clades within the Teleostomi, which includes the familiar classes of bony fishes, birds, mammals, reptiles, and amphibians. We find invariable Gln35, Thr37, Glu64, Gly66, His68, Ser109 and Glu334 in Cα and invariable Asp42, Gln67, and Arg319 in Cβ (Cα1/Cβ1 numbering). The residues in the corresponding positions in human Cα1 and Cβ1 are also shown. doi: /journal.pone g004

Strong Purifying Selection is Lost in PKA Cγ

A comparison of PKA Cα1 from human and the galago, a prosimian primate, shows that there are 59 synonymous (silent) and 4 non-synonymous (amino acid changing) mutations.
Between human and macaque PKA Cγ there are slightly fewer mutations, 44, but these result in 31 amino acid changes, indicating that the strong purifying selection in the PKA Cα lineage is lost in PKA Cγ. A powerful tool for evaluating the evolution of protein-coding sequences is the ratio of the non-synonymous (Ka) and synonymous (Ks) substitution rates, where Ks is the number of synonymous substitutions per synonymous site and Ka is the number of non-synonymous substitutions per non-synonymous site [74]. Ka/Ks < 1 indicates purifying (negative) selection, Ka/Ks > 1 is a sign of positive selection, while Ka/Ks ≈ 1 indicates neutral evolution of the protein. Table 1 shows Ka/Ks for comparisons of five representative tetrapod PKA Cα homologs and five primate PKA Cγ homologs. Due to missing sequences in the databases it was not possible to compare the same species for Cα and Cγ. For the PKA Cα sequences, the average Ka/Ks is […]. PKA Cα sequences were also compared for other placental mammals, and Ka/Ks values were found to be in the range […]. Ka/Ks for a comparison of human PKA Cα and Cβ, and of the human and mouse PKA Cβ orthologs, were […] and […], respectively. These data for the PKA Cα/Cβ homologs in placental mammals again confirm the very strong purifying selection acting upon these kinases. The Ka/Ks values from the comparison of the primate PKA Cγ sequences in Table 1 are in the range […]. Due to the fairly close evolutionary relationships between these species, the number of mutations in PRKACG transcripts is rather low, for example 9, 31 and 46 between human and chimpanzee, orangutan, and macaque, respectively. Consequently, the Ka/Ks values are not expected to be highly reliable, but the average value of 0.45 clearly suggests that the strong purifying selection in the PKA Cα lineage is lost in PKA Cγ.
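The distinction between synonymous and non-synonymous changes that underlies Ka/Ks can be illustrated with a simple count of differing codons between two aligned coding sequences. The Python sketch below is not the Goldman and Yang model or the model averaging method used for Table 1: it only tallies raw codon differences under the standard genetic code and does not normalize by the number of synonymous and non-synonymous sites. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count differing codons that are synonymous or non-synonymous.

    Both sequences must be aligned, gap-free and in frame (assumptions of
    this sketch). A codon pair counts once even if it differs at several
    nucleotide positions.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must have equal length, a multiple of three")
    synonymous = non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical example: GGT -> GGC is silent (Gly), AAA -> AGA changes Lys to Arg.
print(count_codon_differences("ATGGGTCTCAAA", "ATGGGCCTCAGA"))
```

A strong excess of synonymous over non-synonymous differences, as in the human and galago Cα1 comparison, is the pattern expected under purifying selection. Rate-based estimators additionally correct for the unequal numbers of synonymous and non-synonymous sites and for multiple substitutions at the same site.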
This finding that mutations in PKA Cγ appear to be neutral, combined with the loss of a functional Cγ in gibbons and marmoset (vide supra), suggests that there are no evolutionary constraints on maintaining a functional PKA Cγ protein in higher primates. However, this analysis does not exclude the possibility that the PKA Cγ transcript has an important function in humans, great apes and other Simiiformes, for example in regulation of PKA Cα/Cβ transcript processing [75].

Figure 5. Signature residues defining PKA Cα and Cβ do not interact with ATP, peptide inhibitor PKIα or the kinase regulatory subunit. A The tentative signature residues of PKA Cα and Cβ (Fig. 4) are highlighted in a structural model of PKA Cα1 in complex with a truncated PKIα (residues 6–25). Signature amino acids in human Cα1 and Cβ1 are shown without and within parentheses, respectively. The conserved kinase core has been divided into subdomains (represented in different colors) [87] as defined by Hanks and Hunter [88]. ATP is rendered as sticks (black) and two divalent cations as black spheres. The model is based on the experimental structure of Thompson et al. [56] (PDB identifier 3FJQ). B Residues in Cα1 (cyan) interacting with regulatory subunit RIα (purple, residues […] of bovine RIα only) are mainly restricted to the large lobe and do not overlap with any of the Cα signature residues (red). The complex is shown with the same orientation of Cα1 as in panel A (left) and rotated 180° (right). The model is based on the experimental structure of Kim et al. [5] (PDB identifier 3FHI). doi: /journal.pone g005

Table 1. Ratio of amino acid replacing (Ka) and silent (Ks) mutation rates for all pairwise comparisons of five tetrapod PKA Cα homologs and five primate PKA Cγ homologs.

Ka/Ks (a)           H. sapiens Cα   O. garnettii Cα   M. musculus Cα   B. taurus Cα
O. garnettii Cα     (0.0158)
M. musculus Cα      (0.0141)        (0.0226)
B. taurus Cα        (0.0063)        (0.0111)          (0.0158)
X. tropicalis Cα    (0.0060)        (0.0056)          (0.0064)         (0.0060)

Ka/Ks (b)           H. sapiens Cγ   P. troglodytes Cγ   P. pygmaeus Cγ   G. gorilla Cγ
P. troglodytes Cγ   (0.127)
P. pygmaeus Cγ      (0.324)         (0.194)
G. gorilla Cγ       (1.207)         (0.148)             (0.319)
M. mulatta Cγ       (0.678)         (0.406)             (0.393)          (0.603)

(a) Ka/Ks ratios for pairwise comparisons of PKA Cα from human, a prosimian primate (O. garnettii), mouse (M. musculus), cattle (B. taurus) and a frog (X. tropicalis) calculated according to the model of Goldman and Yang [89] and the model averaging method of Zhang et al. (in parentheses) [90] for the Core sequence segment.
(b) Ka/Ks ratios for pairwise comparisons of PKA Cγ from human, chimpanzee (P. troglodytes), orangutan (P. pygmaeus), gorilla (G. gorilla) and rhesus macaque (M. mulatta) calculated as described above.
doi: /journal.pone t001

Conclusion

We have shown that the PKA Cα and Cβ catalytic subunits found in chordates and other animal species form a phylogenetic clade of kinases with a very high degree of conservation at the protein level. In the core segment corresponding to exons 2–10 of vertebrate Cα1/Cβ1, the synonymous mutation rate is approximately two orders of magnitude larger than the amino acid changing mutation rate. All the main residues and sequence segments previously shown to be important for human Cα/Cβ function (Fig.
1C), including the phosphorylation sites, the ATP and Mg2+ interacting residues and the DFG and P+1 motifs, are basically fully conserved in all homologs in chordates, insects and other animal sequences investigated in the current study. The few residues that differ in Cα and Cβ (Fig. 4) should be investigated in order to elucidate possible functional differences between the two paralogs. Finally, the Cα1-derived expressed retroposon Cγ found in higher primates appears to be evolving neutrally and appears to have no function as a mature protein.

Supporting Information

Figure S1 Multiple sequence alignment of segment corresponding to exons 2–10 of human PKA Cα1 (i.e. residues […]). All sequences are described in Materials and Methods S1. Residue numbering of human PKA Cα1 (identifier P17612) is shown above the sequence. Sequences belonging to the Cα, Cγ, and Cβ clades are marked at the right by a green, blue, and red bar, respectively. The figure was prepared with Jalview [44]. (PDF)

Table S1 Sequence data for PKA catalytic subunit homologs from chordates collected and manipulated as described in Materials and Methods S1. (PDF)

Materials and Methods S1 Details on sequence data collection. (PDF)

Acknowledgments

We are grateful to Russell Orr and Kamran Shalchian-Tabrizi for help and useful comments and the University of Oslo Bioportal for computational infrastructure.

Author Contributions

Conceived and designed the experiments: KS TJ TR BSS JKL. Performed the experiments: KS JKL. Analyzed the data: KS TJ TR BSS JKL. Contributed reagents/materials/analysis tools: KS JKL. Wrote the paper: KS TJ TR BSS JKL.

References

1. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S (2002) The protein kinase complement of the human genome. Science 298:
2. Skålhegg BS, Taskén K (2000) Specificity in the cAMP/PKA signaling pathway. Differential expression, regulation, and subcellular localization of subunits of PKA. Front Biosci 5: D678 D
3. Taylor SS, Yang J, Wu J, Haste NM, Radzio-Andzelm E, et al. (2004) PKA: a portrait of protein kinase dynamics. Biochim Biophys Acta 1697:
4. Krebs EG, Beavo JA (1979) Phosphorylation-dephosphorylation of enzymes. Annu Rev Biochem 48:
5. Kim C, Xuong NH, Taylor SS (2005) Crystal structure of a complex between the catalytic and regulatory (RIα) subunits of PKA. Science 307:
6. Anand GS, Krishnamurthy S, Bishnoi T, Kornev A, Taylor SS, et al. (2010) Cyclic AMP- and (Rp)-cAMPS-induced conformational changes in a complex of the catalytic and regulatory (RIα) subunits of cyclic AMP-dependent protein kinase. Mol Cell Proteomics 9:


More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

Many of the slides that I ll use have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Many of the slides that I ll use have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Many of the slides that I ll use have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X

More information

Title slide (1) Tree of life 1891 Ernst Haeckel, Title on left

Title slide (1) Tree of life 1891 Ernst Haeckel, Title on left MDIBL talk July 14, 2005 The Evolution of Cytochrome P450 in animals. Title slide (1) Tree of life 1891 Ernst Haeckel, Title on left My opening slide is a collage (2) containing 35 eukaryotic species with

More information

Exploring Evolution & Bioinformatics

Exploring Evolution & Bioinformatics Chapter 6 Exploring Evolution & Bioinformatics Jane Goodall The human sequence (red) differs from the chimpanzee sequence (blue) in only one amino acid in a protein chain of 153 residues for myoglobin

More information

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family Review: Gene Families Gene Families part 2 03 327/727 Lecture 8 What is a Case study: ian globin genes Gene trees and how they differ from species trees Homology, orthology, and paralogy Last tuesday 1

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

2 Genome evolution: gene fusion versus gene fission

2 Genome evolution: gene fusion versus gene fission 2 Genome evolution: gene fusion versus gene fission Berend Snel, Peer Bork and Martijn A. Huynen Trends in Genetics 16 (2000) 9-11 13 Chapter 2 Introduction With the advent of complete genome sequencing,

More information

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11 The Eukaryotic Genome and Its Expression Lecture Series 11 The Eukaryotic Genome and Its Expression A. The Eukaryotic Genome B. Repetitive Sequences (rem: teleomeres) C. The Structures of Protein-Coding

More information

Sequence analysis and comparison

Sequence analysis and comparison The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species

More information

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata.

Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Supplementary Note S2 Phylogenetic relationship among S. castellii, S. cerevisiae and C. glabrata. Phylogenetic trees reconstructed by a variety of methods from either single-copy orthologous loci (Class

More information

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26 Phylogeny Chapter 26 Taxonomy Taxonomy: ordered division of organisms into categories based on a set of characteristics used to assess similarities and differences Carolus Linnaeus developed binomial nomenclature,

More information

Supplementary Materials for

Supplementary Materials for advances.sciencemag.org/cgi/content/full/3/4/e1600663/dc1 Supplementary Materials for A dynamic hydrophobic core orchestrates allostery in protein kinases Jonggul Kim, Lalima G. Ahuja, Fa-An Chao, Youlin

More information

Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29):

Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29): Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week (Jan 27 & 29): Statistical estimation of models of sequence evolution Phylogenetic inference using maximum likelihood:

More information

Research Article HomoKinase: A Curated Database of Human Protein Kinases

Research Article HomoKinase: A Curated Database of Human Protein Kinases ISRN Computational Biology Volume 2013, Article ID 417634, 5 pages http://dx.doi.org/10.1155/2013/417634 Research Article HomoKinase: A Curated Database of Human Protein Kinases Suresh Subramani, Saranya

More information

Chapters 25 and 26. Searching for Homology. Phylogeny

Chapters 25 and 26. Searching for Homology. Phylogeny Chapters 25 and 26 The Origin of Life as we know it. Phylogeny traces evolutionary history of taxa Systematics- analyzes relationships (modern and past) of organisms Figure 25.1 A gallery of fossils The

More information

5/4/05 Biol 473 lecture

5/4/05 Biol 473 lecture 5/4/05 Biol 473 lecture animals shown: anomalocaris and hallucigenia 1 The Cambrian Explosion - 550 MYA THE BIG BANG OF ANIMAL EVOLUTION Cambrian explosion was characterized by the sudden and roughly simultaneous

More information

Constructing Evolutionary/Phylogenetic Trees

Constructing Evolutionary/Phylogenetic Trees Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood

More information

COMPARING DNA SEQUENCES TO UNDERSTAND EVOLUTIONARY RELATIONSHIPS WITH BLAST

COMPARING DNA SEQUENCES TO UNDERSTAND EVOLUTIONARY RELATIONSHIPS WITH BLAST Big Idea 1 Evolution INVESTIGATION 3 COMPARING DNA SEQUENCES TO UNDERSTAND EVOLUTIONARY RELATIONSHIPS WITH BLAST How can bioinformatics be used as a tool to determine evolutionary relationships and to

More information

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison 10-810: Advanced Algorithms and Models for Computational Biology microrna and Whole Genome Comparison Central Dogma: 90s Transcription factors DNA transcription mrna translation Proteins Central Dogma:

More information

HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington

More information

Multiple Choice Review- Eukaryotic Gene Expression

Multiple Choice Review- Eukaryotic Gene Expression Multiple Choice Review- Eukaryotic Gene Expression 1. Which of the following is the Central Dogma of cell biology? a. DNA Nucleic Acid Protein Amino Acid b. Prokaryote Bacteria - Eukaryote c. Atom Molecule

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline Phylogenetics Todd Vision iology 522 March 26, 2007 pplications of phylogenetics Studying organismal or biogeographic history Systematics ating events in the fossil record onservation biology Studying

More information

Lecture 10: Cyclins, cyclin kinases and cell division

Lecture 10: Cyclins, cyclin kinases and cell division Chem*3560 Lecture 10: Cyclins, cyclin kinases and cell division The eukaryotic cell cycle Actively growing mammalian cells divide roughly every 24 hours, and follow a precise sequence of events know as

More information

Browsing Genomic Information with Ensembl Plants

Browsing Genomic Information with Ensembl Plants Browsing Genomic Information with Ensembl Plants Etienne de Villiers, PhD (Adapted from slides by Bert Overduin EMBL-EBI) Outline of workshop Brief introduction to Ensembl Plants History Content Tutorial

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

Comparative Bioinformatics Midterm II Fall 2004

Comparative Bioinformatics Midterm II Fall 2004 Comparative Bioinformatics Midterm II Fall 2004 Objective Answer, part I: For each of the following, select the single best answer or completion of the phrase. (3 points each) 1. Deinococcus radiodurans

More information

Primate phylogeny: molecular evidence for a pongid clade excluding humans and a prosimian clade containing tarsiers

Primate phylogeny: molecular evidence for a pongid clade excluding humans and a prosimian clade containing tarsiers Huang, 1 Primate phylogeny: molecular evidence for a pongid clade excluding humans and a prosimian clade containing tarsiers Shi Huang State Key Laboratory of Medical Genetics Xiangya Medical School Central

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Cyclin-Dependent Kinase

Cyclin-Dependent Kinase Cyclin-Dependent Kinase (At a glance) Cory Camasta Cyclin-Dependent Kinase Family E.C: 2.7.11.22 2 = Transferase 2.7 = Transfer of phosphate group 2.7.11 = Serine/Threonine Kinase 2.7.11.22 = Cyclin-dependent

More information

Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene Prediction Errors

Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene Prediction Errors Genes 2011, 2, 449-501; doi:10.3390/genes2030449 Article OPEN ACCESS genes ISSN 2073-4425 www.mdpi.com/journal/genes Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene

More information

Hands-On Nine The PAX6 Gene and Protein

Hands-On Nine The PAX6 Gene and Protein Hands-On Nine The PAX6 Gene and Protein Main Purpose of Hands-On Activity: Using bioinformatics tools to examine the sequences, homology, and disease relevance of the Pax6: a master gene of eye formation.

More information

Group activities: Making animal model of human behaviors e.g. Wine preference model in mice

Group activities: Making animal model of human behaviors e.g. Wine preference model in mice Lecture schedule 3/30 Natural selection of genes and behaviors 4/01 Mouse genetic approaches to behavior 4/06 Gene-knockout and Transgenic technology 4/08 Experimental methods for measuring behaviors 4/13

More information

Supplementary Information

Supplementary Information Supplementary Information Supplementary Figure 1. Schematic pipeline for single-cell genome assembly, cleaning and annotation. a. The assembly process was optimized to account for multiple cells putatively

More information

Introduction. Gene expression is the combined process of :

Introduction. Gene expression is the combined process of : 1 To know and explain: Regulation of Bacterial Gene Expression Constitutive ( house keeping) vs. Controllable genes OPERON structure and its role in gene regulation Regulation of Eukaryotic Gene Expression

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species.

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species. Supplementary Figure 1 Icm/Dot secretion system region I in 41 Legionella species. Homologs of the effector-coding gene lega15 (orange) were found within Icm/Dot region I in 13 Legionella species. In four

More information

Gene regulation II Biochemistry 302. February 27, 2006

Gene regulation II Biochemistry 302. February 27, 2006 Gene regulation II Biochemistry 302 February 27, 2006 Molecular basis of inhibition of RNAP by Lac repressor 35 promoter site 10 promoter site CRP/DNA complex 60 Lewis, M. et al. (1996) Science 271:1247

More information

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,

More information

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

Bahnson Biochemistry Cume, April 8, 2006 The Structural Biology of Signal Transduction

Bahnson Biochemistry Cume, April 8, 2006 The Structural Biology of Signal Transduction Name page 1 of 6 Bahnson Biochemistry Cume, April 8, 2006 The Structural Biology of Signal Transduction Part I. The ion Ca 2+ can function as a 2 nd messenger. Pick a specific signal transduction pathway

More information

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods

More information

Regulation of Gene Expression

Regulation of Gene Expression Chapter 18 Regulation of Gene Expression Edited by Shawn Lester PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley

More information

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/8/16

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/8/16 Genome Evolution Outline 1. What: Patterns of Genome Evolution Carol Eunmi Lee Evolution 410 University of Wisconsin 2. Why? Evolution of Genome Complexity and the interaction between Natural Selection

More information

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

More information

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p.110-114 Arrangement of information in DNA----- requirements for RNA Common arrangement of protein-coding genes in prokaryotes=

More information

A Review of camp-dependent Protein Kinase A Catalytic Subunit Structure, Function and Regulation

A Review of camp-dependent Protein Kinase A Catalytic Subunit Structure, Function and Regulation A Review of camp-dependent Protein Kinase A Catalytic Subunit Structure, Function and Regulation Kaitlyn McLeod* Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721, USA Abstract

More information

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

More information

Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition

Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition David D. Pollock* and William J. Bruno* *Theoretical Biology and Biophysics, Los Alamos National

More information

Lecture Notes: BIOL2007 Molecular Evolution

Lecture Notes: BIOL2007 Molecular Evolution Lecture Notes: BIOL2007 Molecular Evolution Kanchon Dasmahapatra (k.dasmahapatra@ucl.ac.uk) Introduction By now we all are familiar and understand, or think we understand, how evolution works on traits

More information

Comparative Genomics II

Comparative Genomics II Comparative Genomics II Advances in Bioinformatics and Genomics GEN 240B Jason Stajich May 19 Comparative Genomics II Slide 1/31 Outline Introduction Gene Families Pairwise Methods Phylogenetic Methods

More information

Conserved spatial patterns across the protein kinase family

Conserved spatial patterns across the protein kinase family Available online at www.sciencedirect.com Biochimica et Biophysica Acta 1784 (2008) 238 243 www.elsevier.com/locate/bbapap Conserved spatial patterns across the protein kinase family Lynn F. Ten Eyck a,b,c,,

More information

Consensus Methods. * You are only responsible for the first two

Consensus Methods. * You are only responsible for the first two Consensus Trees * consensus trees reconcile clades from different trees * consensus is a conservative estimate of phylogeny that emphasizes points of agreement * philosophy: agreement among data sets is

More information

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed

More information