
Frontiers in Life Science
ISSN: (Print) (Online) Journal homepage:

Bioinformatics-based study on prokaryotic, archaeal and eukaryotic nucleic acid-binding proteins for identification of low-complexity and intrinsically disordered regions

Birendra Singh Yadav, Swati Singh, Prashant Kumar, Devika Mathur, Raj Kumar Meena, Rishij Kumar Agrawal & Ashutosh Mani

To cite this article: Birendra Singh Yadav, Swati Singh, Prashant Kumar, Devika Mathur, Raj Kumar Meena, Rishij Kumar Agrawal & Ashutosh Mani (2016) Bioinformatics-based study on prokaryotic, archaeal and eukaryotic nucleic acid-binding proteins for identification of low-complexity and intrinsically disordered regions, Frontiers in Life Science, 9:1, 2-16, DOI: /

To link to this article:

Taylor & Francis
Published online: 04 Sep

FRONTIERS IN LIFE SCIENCE, 2016, VOL. 9, NO. 1

Bioinformatics-based study on prokaryotic, archaeal and eukaryotic nucleic acid-binding proteins for identification of low-complexity and intrinsically disordered regions

Birendra Singh Yadav (a), Swati Singh (b), Prashant Kumar (a), Devika Mathur (a), Raj Kumar Meena (a), Rishij Kumar Agrawal (a) and Ashutosh Mani (a)

(a) Department of Biotechnology, Motilal Nehru National Institute of Technology Allahabad, Allahabad, India; (b) Center of Bioinformatics, University of Allahabad, Allahabad, India

CONTACT Ashutosh Mani amani@mnnit.ac.in
© 2015 Taylor & Francis

ARTICLE HISTORY
Received 8 September 2014
Accepted 19 July 2015

KEYWORDS
Low-complexity regions; intrinsically disordered regions; archaea

ABSTRACT
Intrinsically disordered regions (IDRs) and low-complexity regions (LCRs) in transcription factors (TFs) are known to be key players in various cellular functions. The conformational flexibility of these regions allows them to recognize and interact with a large number of molecules. Previous studies show that certain TFs related to environmental response are significantly enriched in IDRs and LCRs. It has been proposed that all organisms use these IDRs and LCRs to introduce versatility into the interactions of biological processes, allowing them to adapt and respond quickly to challenging environmental conditions. A comparative study has been conducted on these regions to measure the average abundance of LCRs and IDRs in different types of TFs. In this project we have identified the IDRs and LCRs in prokaryotic, eukaryotic and archaeal TFs by using bioinformatics and compared them for average density of IDRs and LCRs.

Introduction

Low-complexity regions

Segments of protein sequences with biased amino acid composition, frequently called low-complexity regions (LCRs), are abundant in the protein universe (Coletta et al. 2010). The degree of low complexity may depend on loosely clustered, irregularly spaced or periodically recurring amino acids of limited diversity (DePristo et al. 2006).

A number of studies have revealed that LCRs exhibit significant divergence across protein families and that the genetic mechanisms from which they arise give them remarkable degrees of compositional plasticity (Coletta et al. 2010). LCRs evolve rapidly, with highly dynamic diversification and a high level of inter-species variation (Marcotte et al. 1999; Ekman et al. 2006). These regions are structurally and functionally neutral, with a high probability of fixation (Ekman et al. 2006). It has been proposed that construction of repeats is a common source of genetic variation among prokaryotes to generate novel surface antigens and adapt to swiftly evolving environments (Moxon et al. 1994; Verstrepen et al. 2005; Coletta et al. 2010). The flexibility of LCRs is thought to be accountable for their adaptable binding capabilities. This flexibility allows LCRs to bind diverse classes of structurally variable targets (Dyson & Wright 2005). The effect of repeats on the structure of the host protein depends on their amino acid composition, though LCRs have a tendency to form loops or disordered structure. LCRs are particularly frequent in transcription factors (TFs), and it has been shown that variations in the length of particular single amino acid repeats, such as glutamine, proline or alanine, can affect the transcriptional activity of the protein. Statistical analyses have revealed that approximately one-quarter of amino acids are present in LCRs and that generally more than half of proteins have at least one LCR (Wootton 1994).

Intrinsically disordered regions

An intrinsically disordered region (IDR) in a protein lacks a definite or ordered three-dimensional structure. IDRs cover a spectrum of shapes from fully unstructured to partially structured and include random coils, (pre-)molten globules, and large multi-domain proteins connected by flexible linkers. IDRs lack stable structure, although in some cases they can acquire a fixed three-dimensional structure after binding to their target (Dyson & Wright 2005; Tompa 2009; Uversky & Dunker 2010). The binding affinity of many disordered proteins for their receptors is regulated by post-translational modification, and the flexibility of disordered proteins accommodates the different conformational requirements for binding the modifying enzymes as well as their targets. It has been reported that proteins involved in binding other macromolecules and in regulatory and signalling functions inside the cell, such as TFs and signalling proteins, have a higher extent of IDRs (Ward et al. 2004; Dyson & Wright 2005; Tompa 2009; Uversky & Dunker 2010).

Figure 1. Phylogenetic trees representing the evolutionary relationship among 30 organisms from prokaryotes, eukaryotes and archaea with respect to (a) ligases, (b) primases, (c) ribonucleases, (d) RNA polymerase II, (e) RNA polymerases and (f) DNA topoisomerases, and their homologs.

Absence of a fixed three-dimensional structure in a protein has several advantages (Gunasekaran et al. 2003; Dyson & Wright 2005; Diella et al. 2008; Gsponer & Babu 2009; Tompa 2009; Uversky & Dunker 2010).

First of all, the unstructured region provides a larger surface area for interaction than a globular protein of similar length. Second, conformational flexibility and the exposure of short linear peptide motifs and interaction-prone structural motifs, that is, Molecular Recognition Features, allow intrinsically disordered proteins to scaffold and interact with several other proteins. Finally, diverse post-translational modifications in these proteins facilitate the regulation of their function and stability in a cell. In addition, as many intrinsically disordered proteins fold upon binding, they frequently interact with their targets with relatively high specificity, but their affinity for their targets remains low. This specific mechanism of action, by which IDRs associate rapidly with their target in order to initiate a signalling process and at the same time dissociate easily when the task is accomplished, is functionally significant (Dyson & Wright 2005).

Archaeal transcriptional machinery

Life on earth is present in the form of three cellular domains, namely bacteria (referred to here as prokaryotes), eukaryotes and archaea (Woese 1998), and is composed of organisms highly diverse in morphology, physiology and natural habitats (Chaban et al. 2006; Clementino et al. 2007; Nam et al. 2008; Auguet et al. 2009). Archaea constitute one of the important cellular domains of life, and the organisms in this domain possess basal transcription machinery resembling that of eukaryotes. The eukaryotic-type machinery that archaea possess includes a TATA box promoter sequence, a TATA box-binding protein (TBP), a homolog of the TF TFIIB (TFB) and an RNA polymerase (RNAp) containing between 8 and 13 subunits (Goede et al. 2006; Rueda & Janga 2010). Archaeal TFs are significantly smaller than other archaeal proteins as well as bacterial TFs, suggesting that a large number of these small TFs could compensate for the probable scarcity of TFs in archaea, possibly by forming different combinations of monomers similar to those observed in the eukaryotic transcriptional machinery (Rueda & Janga 2010). The study by Xue and co-workers proposes that archaeal proteins are rich in intrinsic disorder, which evolved to help archaea adapt to their hostile habitats (Xue et al. 2010).

The aim of this study was to identify LCRs and IDRs in archaeal, bacterial and eukaryotic TFs and compare them with their homologs from other life forms. To identify LCRs, the SEG server (Wootton 1994) was used, while DisMeta was used to identify IDRs.

Materials and methods

Sequence search

Initially, 50 TF families were selected for the study; finally, 15 sequences of archaeal TF families were found to have homologs in at least one prokaryote or eukaryote. These sequences were retrieved from the ArchaeaTF database (Wu et al. 2008). Their closest homologs from bacteria and eukaryotes, available in the public domain, were searched by using NCBI BLAST (Altschul et al. 1997).

Figure 2. Phylogenetic trees representing the evolutionary relationship among 30 organisms from prokaryotes, eukaryotes and archaea with respect to (a) FIC, (b) FUR, (c) GntR, (d) LacI, (e) LexA, (f) LysR, (g) MarR, (h) PadR and (i) PspC, and their homologs.

Identification of LCRs

A total of 180 amino acid sequences belonging to six nucleic acid-interacting protein families from 30 species (10 each of prokaryotes, eukaryotes and archaea) were selected for the identification of LCRs (see Tables 1–6 in the online supplemental data, which is available from the article's Taylor & Francis Online page at / ). The SEG server was used for LCR identification (Wootton 1994). For each sequence, overlapping LCRs were identified so that no residue was counted more than once in the averages. The total LCR length was counted and expressed as a percentage of the full length of the amino acid sequence. Finally, to calculate the average density of LCRs in archaeal sequences, the LCR percentages of all individual archaeal sequences were summed and their average was calculated. The same method was repeated for prokaryotic and eukaryotic sequences.

Identification of disordered regions

A total of 63 amino acid sequences belonging to 9 nucleic acid-interacting protein families from 32 species of prokaryotes, eukaryotes and archaea were selected for the identification of IDRs (see Tables 7–15 in supplemental data). The DisMeta server (Huang et al. 2014) was used for IDR identification. The total IDR length was counted and expressed as a percentage of the full length of the amino acid sequence. Finally, to calculate the average density of IDRs in archaeal sequences, the IDR percentages of all individual archaeal sequences were summed and their average was calculated. The same method was repeated to calculate the average density of IDRs in prokaryotic and eukaryotic sequences.
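
The density calculation described in the two subsections above can be illustrated with a minimal sketch. It is not the authors' code: it assumes that the LCR or IDR segments have already been parsed from the SEG or DisMeta output into 1-based, inclusive (start, end) residue coordinates, and the function names and example values are hypothetical.

```python
from statistics import mean


def merge_intervals(segments):
    """Merge overlapping or adjacent (start, end) segments.

    Coordinates are assumed to be 1-based and inclusive (an assumption,
    not something stated in the article).
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def region_percentage(segments, seq_length):
    """Percentage of a sequence covered by regions, counting each residue once."""
    covered = sum(end - start + 1 for start, end in merge_intervals(segments))
    return 100.0 * covered / seq_length


def average_density(records):
    """Mean of per-sequence percentages for one group (e.g. archaea).

    `records` is a list of (segments, sequence_length) pairs.
    """
    return mean(region_percentage(segs, length) for segs, length in records)


# Hypothetical example data, not taken from the study.
archaeal_records = [
    ([(10, 25), (20, 40)], 200),  # overlapping segments: 31 residues, 15.5%
    ([(5, 14)], 100),             # 10 residues, 10.0%
    ([], 150),                    # no regions detected, 0.0%
]
print(average_density(archaeal_records))  # 8.5
```

Merging the segments before summing their lengths is what prevents overlapping regions from being counted twice; the same functions could be applied unchanged to the prokaryotic and eukaryotic groups.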
Multiple sequence alignment and phylogenetic tree

Multiple sequence alignment and phylogenetic tree construction were performed by using MEGA 6.06 (Tamura et al. 2013) software. All the homologous sequences were first aligned and then manually curated. The maximum likelihood method was used to construct trees for the sequences used in the identification of LCRs (Figure 1(a)–(f)) and IDRs (Figure 2(a)–(i)).

Physicochemical property studies

To assess the physicochemical properties of the residues that make up LCRs, the COPID server was used ( copid/help.html). The percentage of small, polar and charged residues was calculated for sequences falling in the regions of LCRs (Figure 3(a)–(c)) and IDRs (Figure 4(a)–(c)).

Figure 3. Composition of sequences based on physicochemical property of amino acids present in LCRs of (a) archaea, (b) eukaryotes and (c) prokaryotes. Numerical values represent the percentage of the specific amino acid in the LCRs of the respective group.

Figure 4. Composition of sequences based on physicochemical property of amino acids present in IDRs of (a) archaea, (b) prokaryotes and (c) eukaryotes.

Phosphorylation site prediction

The NetPhos 2.0 server (Blom et al. 1999) was used for prediction of phosphorylation sites in protein sequences. Phosphorylation sites in disordered regions were predicted separately. The prediction included phosphorylation sites of tyrosine, serine and threonine residues.

Results and discussion

The amino acid composition studies revealed that prokaryotic, eukaryotic and archaeal LCRs have the highest percentage of small residues, followed by polar residues, while in the IDRs of archaea small and neutral residues are the leading types of amino acids. Eukaryotic IDRs follow the same pattern as the LCRs. Phosphorylation site studies revealed that in eukaryotes the probability of occurrence of a serine, tyrosine or threonine phosphorylation site in disordered regions is lower (7.3%) than in the full-length sequences, where the occurrence of phosphorylation sites was found to be 8.0%. On the same pattern, these values were 7.99 and 5.9 for archaea. In previous reports, intrinsically disordered proteins (IDPs) have been found to be substantially enriched in polar and disorder-promoting amino acids (Ala, Arg, Gly, Gln, Ser, Glu and Lys) as well as Pro residues, which are hydrophobic in nature and have a significant structure-breaking role (Dunker et al. 2001; Romero et al. 2001; Williams et al. 2001). These biases in the amino acid composition of IDPs are also consistent with their low overall hydrophobicity and high net charge (Uversky et al. 2000). In the present study, it was observed that prokaryotic IDRs are high in small, neutral and hydrophobic residues, while in eukaryotes these regions are rich in small, polar and neutral residues followed by hydrophobic residues. Archaeal IDRs were found to be rich in small, polar and hydrophobic residues (Figure 4(a)–(c)).

Figure 5. Average density of LCRs in prokaryotes, eukaryotes and archaea for (a) primases, DNA ligases, ribonucleases and their homologs, and (b) RNA polymerase II, RNA polymerase, topoisomerases and their homologs.

Figure 6. Average density of IDRs in prokaryotes, eukaryotes and archaea for (a) FUR, GntR and FIC proteins and their homologs, (b) LacI, LexA and PadR proteins and their homologs, and (c) MarR, PspC and LysR proteins and their homologs.

Archaea have higher average percentage density of LCRs

The average density of LCRs in archaea for ribonucleases, DNA ligase, DNA primase, DNA topoisomerase and RNA polymerase I was found to be higher than in prokaryotes (Figure 5(a) and (b)). In all these cases, except for RNA polymerase II, the archaeal LCR densities are also higher than those of their homologs in eukaryotes. However, for ribonuclease the average density of LCRs in prokaryotes is higher than in eukaryotes. Finally, the average density of LCRs in prokaryotic, eukaryotic and archaeal sequences is , and 19.25, respectively. Thus the archaeal transcriptional machinery has the highest density of LCRs among the proteins used in this study. For proteins that were present in one life form but not in any other, the closest homologs obtained by BLAST search in NCBI were used as representatives of the protein.

Archaeal TFs are generally not rich in terms of IDRs

The average density of IDRs in archaea for FUR, GntR, LexA, MarR and LysR is higher than in prokaryotes and eukaryotes (Figure 6(a)–(c)), while for FIC and PspC it is higher than in prokaryotes and eukaryotes, respectively. For PadR and LacI, the average density of archaeal IDRs is lower than in both prokaryotes and eukaryotes. From the above results, it is clear that archaea generally possess a higher average density of LCRs, while they are not rich in IDRs. Though some proteins such as FUR, GntR, LexA, MarR and LysR have a higher density of IDRs in archaea, this trend is generally not followed by other proteins such as PadR, FIC, PspC and LacI. All these proteins have a higher average density of IDRs in prokaryotes in comparison to archaea.

Conclusion

The archaeal transcription apparatus is a simplified form of the eukaryotic transcription machinery.
It is known that archaea have a prokaryotic type of transcriptional machinery, since 53% of archaeal TFs are homologs of bacterial TFs, but they have to work for a eukaryotic type of protein synthesis system. Despite having relatively small TFs that are similar to prokaryotic TFs, archaea possess a higher average density of LCRs in comparison to prokaryotes and eukaryotes. However, this correlation with complexity of function does not hold for IDRs, where no set trend was observed. LCRs probably provide archaeal TFs with enough flexibility in protein–protein interactions to form homo- and hetero-multimers and to work in a manner that resembles eukaryotic-type mechanisms.

Acknowledgements

BSY and SS analyzed the data and helped in writing the manuscript. PK, DM, RKM and RJ performed sequence retrieval and software runs, and AM conceived and supervised this project.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

A project research grant TEQIP-II to AM is highly acknowledged.

Supplemental data

Supplemental data for this article can be accessed here. [doi: / ]

References

Altschul SF, Madden TL, Schaffer AA, Zhang J, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:
Auguet JC, Barberan A, Casamayor EO. Global ecological patterns in uncultured archaea. ISME J. 4:
Blom N, Gammeltoft S, Brunak S. Sequence- and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 294(5):
Chaban B, Ng SY, Jarrell KF. Archaeal habitats from the extreme to the ordinary. Can J Microbiol. 52:
Clementino MM, Fernandes CC, Vieira RP, Cardoso AM, Polycarpo CR, Martins OB. Archaeal diversity in naturally occurring and impacted environments from a tropical region. J Appl Microbiol. 103:
Coletta A, Pinney JW, Solís DYW, Marsh J, Pettifer SR, Attwood TK. Low-complexity regions within protein sequences have position-dependent roles. BMC Syst Biol. 4:43. doi: /

DePristo M, Zilversmit M, Hartl D. On the abundance, amino acid composition, and evolutionary dynamics of low-complexity regions in proteins. Gene. 378:
Diella F, Haslam N, Chica C, Budd A, Michael S, Brown NP, Trave G, Gibson TJ. Understanding eukaryotic linear motifs and their role in cell signaling and regulation. Front Biosci. 13:
Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, Oldfield CJ, Campen AM, Ratliff CM, Hipps KW, et al. Intrinsically disordered protein. J Mol Graph Model. 19(1):
Dyson H, Wright P. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 6:
Ekman D, Light S, Bjorklund A, Elofsson A. What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? Genome Biol. 7(6):R45. doi: /gb r45
Goede B, Naji S, Kampen OV, Ilg K, Thomm M. Protein–protein interactions in the archaeal transcriptional machinery: binding studies of isolated RNA polymerase subunits and transcription factors. J Biol Chem. 281:
Gsponer J, Babu MM. The rules of disorder or why disorder rules. Prog Biophys Mol Biol. 99:
Gunasekaran K, Tsai CJ, Kumar S, Zanuy D, Nussinov R. Extended disordered proteins: targeting function with less scaffold. Trends Biochem Sci. 28:
Huang YJ, Acton TB, Montelione GT. DisMeta: a meta server for construct design and optimization. Methods Mol Biol. 1091:3–16.
Marcotte E, Pellegrini M, Yeates T, Eisenberg D. A census of protein repeats. J Mol Biol. 293:
Moxon E, Rainey P, Nowak M, Lenski R. Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr Biol. 4:
Nam YD, Chang HW, Kim KH, Roh SW, Kim MS, Jung MJ, Lee SW, Kim JY, Yoon JH, Bae JW. Bacterial, archaeal, and eukaryal diversity in the intestines of Korean people. J Microbiol. 46:
Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. Sequence complexity of disordered protein. Proteins. 42(1):
Rueda EP, Janga SC. Identification and genomic analysis of transcription factors in archaeal genomes exemplifies their functional architecture and evolutionary origin. Mol Biol Evol. 27(6):
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 30:
Tompa P. Structure and function of intrinsically disordered proteins. 1st ed. Boca Raton: Chapman and Hall/CRC.
Uversky VN, Dunker AK. Understanding protein nonfolding. Biochim Biophys Acta. 1804:
Uversky VN, Gillespie JR, Fink AL. Why are natively unfolded proteins unstructured under physiologic conditions? Proteins. 41(3):
Verstrepen K, Jansen A, Lewitter F, Fink G. Intragenic tandem repeats generate functional variability. Nat Genet. 37(9):
Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 337:
Williams RM, Obradovic Z, Mathura V, Braun W, Garner EC, Young J, Takayama S, Brown CJ, Dunker AK. The protein non-folding problem: amino acid determinants of intrinsic order and disorder. Pac Symp Biocomput. 6:
Woese C. The universal ancestor. Proc Natl Acad Sci USA. 95:
Wootton JC. Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput Chem. 18(3):
Wu J, Wang S, Bai J, Shi L, Li D, Xu Z, Niu Y, Lu J, Bao Q. ArchaeaTF: an integrated database of putative transcription factors in archaea. Genomics. 91(1):
Xue B, Williams RW, Oldfield CJ, Dunker AK, Uversky VN. Archaic chaos: intrinsically disordered proteins in archaea. BMC Syst Biol. 4(Suppl 1): S1. doi: / s1-s1


More information

This is a repository copy of Microbiology: Mind the gaps in cellular evolution.

This is a repository copy of Microbiology: Mind the gaps in cellular evolution. This is a repository copy of Microbiology: Mind the gaps in cellular evolution. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/114978/ Version: Accepted Version Article:

More information

Introduction to Evolutionary Concepts

Introduction to Evolutionary Concepts Introduction to Evolutionary Concepts and VMD/MultiSeq - Part I Zaida (Zan) Luthey-Schulten Dept. Chemistry, Beckman Institute, Biophysics, Institute of Genomics Biology, & Physics NIH Workshop 2009 VMD/MultiSeq

More information

An Introduction to Sequence Similarity ( Homology ) Searching

An Introduction to Sequence Similarity ( Homology ) Searching An Introduction to Sequence Similarity ( Homology ) Searching Gary D. Stormo 1 UNIT 3.1 1 Washington University, School of Medicine, St. Louis, Missouri ABSTRACT Homologous sequences usually have the same,

More information

Chapter 16 Lecture. Concepts Of Genetics. Tenth Edition. Regulation of Gene Expression in Prokaryotes

Chapter 16 Lecture. Concepts Of Genetics. Tenth Edition. Regulation of Gene Expression in Prokaryotes Chapter 16 Lecture Concepts Of Genetics Tenth Edition Regulation of Gene Expression in Prokaryotes Chapter Contents 16.1 Prokaryotes Regulate Gene Expression in Response to Environmental Conditions 16.2

More information

Protein Architecture V: Evolution, Function & Classification. Lecture 9: Amino acid use units. Caveat: collagen is a. Margaret A. Daugherty.

Protein Architecture V: Evolution, Function & Classification. Lecture 9: Amino acid use units. Caveat: collagen is a. Margaret A. Daugherty. Lecture 9: Protein Architecture V: Evolution, Function & Classification Margaret A. Daugherty Fall 2004 Amino acid use *Proteins don t use aa s equally; eg, most proteins not repeating units. Caveat: collagen

More information

Bioinformatics Chapter 1. Introduction

Bioinformatics Chapter 1. Introduction Bioinformatics Chapter 1. Introduction Outline! Biological Data in Digital Symbol Sequences! Genomes Diversity, Size, and Structure! Proteins and Proteomes! On the Information Content of Biological Sequences!

More information

Exploring Evolution & Bioinformatics

Exploring Evolution & Bioinformatics Chapter 6 Exploring Evolution & Bioinformatics Jane Goodall The human sequence (red) differs from the chimpanzee sequence (blue) in only one amino acid in a protein chain of 153 residues for myoglobin

More information

GCD3033:Cell Biology. Transcription

GCD3033:Cell Biology. Transcription Transcription Transcription: DNA to RNA A) production of complementary strand of DNA B) RNA types C) transcription start/stop signals D) Initiation of eukaryotic gene expression E) transcription factors

More information

REVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E

REVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E REVIEW SESSION Wednesday, September 15 5:30 PM SHANTZ 242 E Gene Regulation Gene Regulation Gene expression can be turned on, turned off, turned up or turned down! For example, as test time approaches,

More information

56:198:582 Biological Networks Lecture 8

56:198:582 Biological Networks Lecture 8 56:198:582 Biological Networks Lecture 8 Course organization Two complementary approaches to modeling and understanding biological networks Constraint-based modeling (Palsson) System-wide Metabolism Steady-state

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:10.1038/nature17991 Supplementary Discussion Structural comparison with E. coli EmrE The DMT superfamily includes a wide variety of transporters with 4-10 TM segments 1. Since the subfamilies of the

More information

BME 5742 Biosystems Modeling and Control

BME 5742 Biosystems Modeling and Control BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various

More information

Prokaryotic Regulation

Prokaryotic Regulation Prokaryotic Regulation Control of transcription initiation can be: Positive control increases transcription when activators bind DNA Negative control reduces transcription when repressors bind to DNA regulatory

More information

Automatic Epitope Recognition in Proteins Oriented to the System for Macromolecular Interaction Assessment MIAX

Automatic Epitope Recognition in Proteins Oriented to the System for Macromolecular Interaction Assessment MIAX Genome Informatics 12: 113 122 (2001) 113 Automatic Epitope Recognition in Proteins Oriented to the System for Macromolecular Interaction Assessment MIAX Atsushi Yoshimori Carlos A. Del Carpio yosimori@translell.eco.tut.ac.jp

More information

Name: SBI 4U. Gene Expression Quiz. Overall Expectation:

Name: SBI 4U. Gene Expression Quiz. Overall Expectation: Gene Expression Quiz Overall Expectation: - Demonstrate an understanding of concepts related to molecular genetics, and how genetic modification is applied in industry and agriculture Specific Expectation(s):

More information

Gene regulation II Biochemistry 302. February 27, 2006

Gene regulation II Biochemistry 302. February 27, 2006 Gene regulation II Biochemistry 302 February 27, 2006 Molecular basis of inhibition of RNAP by Lac repressor 35 promoter site 10 promoter site CRP/DNA complex 60 Lewis, M. et al. (1996) Science 271:1247

More information

PROTEOMICS. Is the intrinsic disorder of proteins the cause of the scale-free architecture of protein-protein interaction networks?

PROTEOMICS. Is the intrinsic disorder of proteins the cause of the scale-free architecture of protein-protein interaction networks? Is the intrinsic disorder of proteins the cause of the scale-free architecture of protein-protein interaction networks? Journal: Manuscript ID: Wiley - Manuscript type: Date Submitted by the Author: Complete

More information

Comparative genomics: Overview & Tools + MUMmer algorithm

Comparative genomics: Overview & Tools + MUMmer algorithm Comparative genomics: Overview & Tools + MUMmer algorithm Urmila Kulkarni-Kale Bioinformatics Centre University of Pune, Pune 411 007. urmila@bioinfo.ernet.in Genome sequence: Fact file 1995: The first

More information

CHEM 3653 Exam # 1 (03/07/13)

CHEM 3653 Exam # 1 (03/07/13) 1. Using phylogeny all living organisms can be divided into the following domains: A. Bacteria, Eukarya, and Vertebrate B. Archaea and Eukarya C. Bacteria, Eukarya, and Archaea D. Eukarya and Bacteria

More information

UNIT 6 PART 3 *REGULATION USING OPERONS* Hillis Textbook, CH 11

UNIT 6 PART 3 *REGULATION USING OPERONS* Hillis Textbook, CH 11 UNIT 6 PART 3 *REGULATION USING OPERONS* Hillis Textbook, CH 11 REVIEW: Signals that Start and Stop Transcription and Translation BUT, HOW DO CELLS CONTROL WHICH GENES ARE EXPRESSED AND WHEN? First of

More information

CS-E5880 Modeling biological networks Gene regulatory networks

CS-E5880 Modeling biological networks Gene regulatory networks CS-E5880 Modeling biological networks Gene regulatory networks Jukka Intosalmi (based on slides by Harri Lähdesmäki) Department of Computer Science Aalto University January 12, 2018 Outline Modeling gene

More information

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family.

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Research Proposal Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Name: Minjal Pancholi Howard University Washington, DC. June 19, 2009 Research

More information

What can sequences tell us?

What can sequences tell us? Bioinformatics What can sequences tell us? AGACCTGAGATAACCGATAC By themselves? Not a heck of a lot...* *Indeed, one of the key results learned from the Human Genome Project is that disease is much more

More information

The architecture of transcription elongation A crystal structure explains how transcription factors enhance elongation and pausing

The architecture of transcription elongation A crystal structure explains how transcription factors enhance elongation and pausing The architecture of transcription elongation A crystal structure explains how transcription factors enhance elongation and pausing By Thomas Fouqueau and Finn Werner The molecular machines that carry out

More information

CHAPTER 13 PROKARYOTE GENES: E. COLI LAC OPERON

CHAPTER 13 PROKARYOTE GENES: E. COLI LAC OPERON PROKARYOTE GENES: E. COLI LAC OPERON CHAPTER 13 CHAPTER 13 PROKARYOTE GENES: E. COLI LAC OPERON Figure 1. Electron micrograph of growing E. coli. Some show the constriction at the location where daughter

More information

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES Protein Structure W. M. Grogan, Ph.D. OBJECTIVES 1. Describe the structure and characteristic properties of typical proteins. 2. List and describe the four levels of structure found in proteins. 3. Relate

More information

Topic 4 - #14 The Lactose Operon

Topic 4 - #14 The Lactose Operon Topic 4 - #14 The Lactose Operon The Lactose Operon The lactose operon is an operon which is responsible for the transport and metabolism of the sugar lactose in E. coli. - Lactose is one of many organic

More information

Lecture 4: Transcription networks basic concepts

Lecture 4: Transcription networks basic concepts Lecture 4: Transcription networks basic concepts - Activators and repressors - Input functions; Logic input functions; Multidimensional input functions - Dynamics and response time 2.1 Introduction The

More information

Bioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre

Bioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre Bioinformatics Scoring Matrices David Gilbert Bioinformatics Research Centre www.brc.dcs.gla.ac.uk Department of Computing Science, University of Glasgow Learning Objectives To explain the requirement

More information

Microbial Taxonomy and the Evolution of Diversity

Microbial Taxonomy and the Evolution of Diversity 19 Microbial Taxonomy and the Evolution of Diversity Copyright McGraw-Hill Global Education Holdings, LLC. Permission required for reproduction or display. 1 Taxonomy Introduction to Microbial Taxonomy

More information

Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences

Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD Department of Computer Science University of Missouri 2008 Free for Academic

More information

We used the PSI-BLAST program (http://www.ncbi.nlm.nih.gov/blast/) to search the

We used the PSI-BLAST program (http://www.ncbi.nlm.nih.gov/blast/) to search the SUPPLEMENTARY METHODS - in silico protein analysis We used the PSI-BLAST program (http://www.ncbi.nlm.nih.gov/blast/) to search the Protein Data Bank (PDB, http://www.rcsb.org/pdb/) and the NCBI non-redundant

More information

Computational Cell Biology Lecture 4

Computational Cell Biology Lecture 4 Computational Cell Biology Lecture 4 Case Study: Basic Modeling in Gene Expression Yang Cao Department of Computer Science DNA Structure and Base Pair Gene Expression Gene is just a small part of DNA.

More information

Regulation of Gene Expression

Regulation of Gene Expression Chapter 18 Regulation of Gene Expression Edited by Shawn Lester PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley

More information

Introduction to Bioinformatics Integrated Science, 11/9/05

Introduction to Bioinformatics Integrated Science, 11/9/05 1 Introduction to Bioinformatics Integrated Science, 11/9/05 Morris Levy Biological Sciences Research: Evolutionary Ecology, Plant- Fungal Pathogen Interactions Coordinator: BIOL 495S/CS490B/STAT490B Introduction

More information

Flow of Genetic Information

Flow of Genetic Information presents Flow of Genetic Information A Montagud E Navarro P Fernández de Córdoba JF Urchueguía Elements Nucleic acid DNA RNA building block structure & organization genome building block types Amino acid

More information

Prokaryotic Gene Expression (Learning Objectives)

Prokaryotic Gene Expression (Learning Objectives) Prokaryotic Gene Expression (Learning Objectives) 1. Learn how bacteria respond to changes of metabolites in their environment: short-term and longer-term. 2. Compare and contrast transcriptional control

More information

15.2 Prokaryotic Transcription *

15.2 Prokaryotic Transcription * OpenStax-CNX module: m52697 1 15.2 Prokaryotic Transcription * Shannon McDermott Based on Prokaryotic Transcription by OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons

More information

Chapter 19. Microbial Taxonomy

Chapter 19. Microbial Taxonomy Chapter 19 Microbial Taxonomy 12-17-2008 Taxonomy science of biological classification consists of three separate but interrelated parts classification arrangement of organisms into groups (taxa; s.,taxon)

More information

PROTEIN SECONDARY STRUCTURE PREDICTION: AN APPLICATION OF CHOU-FASMAN ALGORITHM IN A HYPOTHETICAL PROTEIN OF SARS VIRUS

PROTEIN SECONDARY STRUCTURE PREDICTION: AN APPLICATION OF CHOU-FASMAN ALGORITHM IN A HYPOTHETICAL PROTEIN OF SARS VIRUS Int. J. LifeSc. Bt & Pharm. Res. 2012 Kaladhar, 2012 Research Paper ISSN 2250-3137 www.ijlbpr.com Vol.1, Issue. 1, January 2012 2012 IJLBPR. All Rights Reserved PROTEIN SECONDARY STRUCTURE PREDICTION:

More information

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Introduction to Comparative Protein Modeling. Chapter 4 Part I Introduction to Comparative Protein Modeling Chapter 4 Part I 1 Information on Proteins Each modeling study depends on the quality of the known experimental data. Basis of the model Search in the literature

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

Regulation of gene Expression in Prokaryotes & Eukaryotes

Regulation of gene Expression in Prokaryotes & Eukaryotes Regulation of gene Expression in Prokaryotes & Eukaryotes 1 The trp Operon Contains 5 genes coding for proteins (enzymes) required for the synthesis of the amino acid tryptophan. Also contains a promoter

More information

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega BLAST Multiple Sequence Alignments: Clustal Omega What does basic BLAST do (e.g. what is input sequence and how does BLAST look for matches?) Susan Parrish McDaniel College Multiple Sequence Alignments

More information

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

More information

Chapter 20. Initiation of transcription. Eukaryotic transcription initiation

Chapter 20. Initiation of transcription. Eukaryotic transcription initiation Chapter 20. Initiation of transcription Eukaryotic transcription initiation 2003. 5.22 Prokaryotic vs eukaryotic Bacteria = one RNA polymerase Eukaryotes have three RNA polymerases (I, II, and III) in

More information

Supplemental Data. Perea-Resa et al. Plant Cell. (2012) /tpc

Supplemental Data. Perea-Resa et al. Plant Cell. (2012) /tpc Supplemental Data. Perea-Resa et al. Plant Cell. (22)..5/tpc.2.3697 Sm Sm2 Supplemental Figure. Sequence alignment of Arabidopsis LSM proteins. Alignment of the eleven Arabidopsis LSM proteins. Sm and

More information

DATE A DAtabase of TIM Barrel Enzymes

DATE A DAtabase of TIM Barrel Enzymes DATE A DAtabase of TIM Barrel Enzymes 2 2.1 Introduction.. 2.2 Objective and salient features of the database 2.2.1 Choice of the dataset.. 2.3 Statistical information on the database.. 2.4 Features....

More information

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting.

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting. Genome Annotation Bioinformatics and Computational Biology Genome Annotation Frank Oliver Glöckner 1 Genome Analysis Roadmap Genome sequencing Assembly Gene prediction Protein targeting trna prediction

More information

Amino Acid Structures from Klug & Cummings. 10/7/2003 CAP/CGS 5991: Lecture 7 1

Amino Acid Structures from Klug & Cummings. 10/7/2003 CAP/CGS 5991: Lecture 7 1 Amino Acid Structures from Klug & Cummings 10/7/2003 CAP/CGS 5991: Lecture 7 1 Amino Acid Structures from Klug & Cummings 10/7/2003 CAP/CGS 5991: Lecture 7 2 Amino Acid Structures from Klug & Cummings

More information

Three types of RNA polymerase in eukaryotic nuclei

Three types of RNA polymerase in eukaryotic nuclei Three types of RNA polymerase in eukaryotic nuclei Type Location RNA synthesized Effect of α-amanitin I Nucleolus Pre-rRNA for 18,.8 and 8S rrnas Insensitive II Nucleoplasm Pre-mRNA, some snrnas Sensitive

More information

Regulation of gene expression. Premedical - Biology

Regulation of gene expression. Premedical - Biology Regulation of gene expression Premedical - Biology Regulation of gene expression in prokaryotic cell Operon units system of negative feedback positive and negative regulation in eukaryotic cell - at any

More information

The Minimal-Gene-Set -Kapil PHY498BIO, HW 3

The Minimal-Gene-Set -Kapil PHY498BIO, HW 3 The Minimal-Gene-Set -Kapil Rajaraman(rajaramn@uiuc.edu) PHY498BIO, HW 3 The number of genes in organisms varies from around 480 (for parasitic bacterium Mycoplasma genitalium) to the order of 100,000

More information

Tools and Algorithms in Bioinformatics

Tools and Algorithms in Bioinformatics Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information

Chapter 1. Topic: Overview of basic principles

Chapter 1. Topic: Overview of basic principles Chapter 1 Topic: Overview of basic principles Four major themes of biochemistry I. What are living organism made from? II. How do organism acquire and use energy? III. How does an organism maintain its

More information

MiGA: The Microbial Genome Atlas

MiGA: The Microbial Genome Atlas December 12 th 2017 MiGA: The Microbial Genome Atlas Jim Cole Center for Microbial Ecology Dept. of Plant, Soil & Microbial Sciences Michigan State University East Lansing, Michigan U.S.A. Where I m From

More information

Sequences, Structures, and Gene Regulatory Networks

Sequences, Structures, and Gene Regulatory Networks Sequences, Structures, and Gene Regulatory Networks Learning Outcomes After this class, you will Understand gene expression and protein structure in more detail Appreciate why biologists like to align

More information

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki. Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein

More information

Lecture 10: Cyclins, cyclin kinases and cell division

Lecture 10: Cyclins, cyclin kinases and cell division Chem*3560 Lecture 10: Cyclins, cyclin kinases and cell division The eukaryotic cell cycle Actively growing mammalian cells divide roughly every 24 hours, and follow a precise sequence of events know as

More information

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid. 1. A change that makes a polypeptide defective has been discovered in its amino acid sequence. The normal and defective amino acid sequences are shown below. Researchers are attempting to reproduce the

More information

13.4 Gene Regulation and Expression

13.4 Gene Regulation and Expression 13.4 Gene Regulation and Expression Lesson Objectives Describe gene regulation in prokaryotes. Explain how most eukaryotic genes are regulated. Relate gene regulation to development in multicellular organisms.

More information

Gene Regulation and Expression

Gene Regulation and Expression THINK ABOUT IT Think of a library filled with how-to books. Would you ever need to use all of those books at the same time? Of course not. Now picture a tiny bacterium that contains more than 4000 genes.

More information

Chapter 26 Phylogeny and the Tree of Life

Chapter 26 Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life Chapter focus Shifting from the process of how evolution works to the pattern evolution produces over time. Phylogeny Phylon = tribe, geny = genesis or origin

More information

Introduction to Bioinformatics

Introduction to Bioinformatics CSCI8980: Applied Machine Learning in Computational Biology Introduction to Bioinformatics Rui Kuang Department of Computer Science and Engineering University of Minnesota kuang@cs.umn.edu History of Bioinformatics

More information

Chapter 15 Active Reading Guide Regulation of Gene Expression

Chapter 15 Active Reading Guide Regulation of Gene Expression Name: AP Biology Mr. Croft Chapter 15 Active Reading Guide Regulation of Gene Expression The overview for Chapter 15 introduces the idea that while all cells of an organism have all genes in the genome,

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Copyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years.

Copyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years. Structure Determination and Sequence Analysis The vast majority of the experimentally determined three-dimensional protein structures have been solved by one of two methods: X-ray diffraction and Nuclear

More information

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail

More information

THE TANGO ALGORITHM: SECONDARY STRUCTURE PROPENSITIES, STATISTICAL MECHANICS APPROXIMATION

THE TANGO ALGORITHM: SECONDARY STRUCTURE PROPENSITIES, STATISTICAL MECHANICS APPROXIMATION THE TANGO ALGORITHM: SECONDARY STRUCTURE PROPENSITIES, STATISTICAL MECHANICS APPROXIMATION AND CALIBRATION Calculation of turn and beta intrinsic propensities. A statistical analysis of a protein structure

More information

Warm-Up. Explain how a secondary messenger is activated, and how this affects gene expression. (LO 3.22)

Warm-Up. Explain how a secondary messenger is activated, and how this affects gene expression. (LO 3.22) Warm-Up Explain how a secondary messenger is activated, and how this affects gene expression. (LO 3.22) Yesterday s Picture The first cell on Earth (approx. 3.5 billion years ago) was simple and prokaryotic,

More information

RNA Synthesis and Processing

RNA Synthesis and Processing RNA Synthesis and Processing Introduction Regulation of gene expression allows cells to adapt to environmental changes and is responsible for the distinct activities of the differentiated cell types that

More information

networks in molecular biology Wolfgang Huber

networks in molecular biology Wolfgang Huber networks in molecular biology Wolfgang Huber networks in molecular biology Regulatory networks: components = gene products interactions = regulation of transcription, translation, phosphorylation... Metabolic

More information

Genetic transcription and regulation

Genetic transcription and regulation Genetic transcription and regulation Central dogma of biology DNA codes for DNA DNA codes for RNA RNA codes for proteins not surprisingly, many points for regulation of the process DNA codes for DNA replication

More information

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17.

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17. Genetic Variation: The genetic substrate for natural selection What about organisms that do not have sexual reproduction? Horizontal Gene Transfer Dr. Carol E. Lee, University of Wisconsin In prokaryotes:

More information