Chapter 2: Spial


[Figure: thesis overview. Chapters 1 to 6 (Introduction, Spial, Quorum sensing, Chemogenomics, Descriptor relationships, Conclusions and perspectives) arranged by level: atomic, pathway, proteome, and cellular.]


Contents of chapter 2

Summary of chapter 2
Introduction
Underlying assumptions
On the shoulders of giants
    Scorecons
    Sequence logos
    ConSurf
    The evolutionary trace method
    Pairwise HMM logos
    Two Sample Logo
The Spial algorithm
Examples of use
    Example 1: The dimerisation interface of STAT5a
    Example 2: Differences in co-ordination of retinal between vertebrate and cephalopod rhodopsin
Conclusions
References

Summary of chapter 2

Spial (Specificity in alignments) is a tool for the comparative analysis of two alignments of related sequences that differ in their annotation, such as two receptor subtypes. It highlights functionally important residues that are specific to one but not both of the two alignments and visualises this information in three complementary ways: by colour-coding alignment positions, by sequence logos, or by colour-coding the residues of a protein structure provided by the user.

Identifying such residues can help to find those that are involved in the alignment-specific interaction with a small molecule, other proteins, or nucleic acids. Alternatively, Spial may be used to identify residues that are the target of post-translational modifications in one of the two alignments but not in the other.

Parts of this chapter will appear in the following peer-reviewed article:

Wuster, A., G. F. Schertler and M. Madan Babu (2009). "Spial: Analysis of subtype-specific features in multiple sequence alignments of proteins." Bioinformatics (in review).

I could not have created Spial without the help of Gebhard Schertler, MRC Laboratory of Molecular Biology.

Spial can be accessed on:

Introduction

Identifying protein residues that are associated with specific functions is a recurring task in molecular biology. To assist with this, I have developed Spial, a web-based tool for the comparative analysis of two related protein subtypes. Spial differs from similar tools by allowing the identification of residues that are specific to one of the two subtypes but not the other. For example, when comparing the alignments of two related receptor subtypes that bind two different small molecules, Spial allows the identification of those residues that are specific to the binding of each of those ligands. For this, Spial takes two related sequence alignments as input (figure 2-1) and assigns each position to one of eight possible types, depending on whether it is specific to the first alignment, the second alignment, the consensus, or any combination of those (figure 2-2).

Figure 2-1. Screenshot of the Spial submission page, as displayed in the Firefox browser. The user provides two alignments (A and B) by either entering them into the text boxes or by uploading them from her hard disk. Several parameters, such as the consensus threshold or the specificity threshold, can be amended. If a protein structure is available, the user can upload it from either the Protein Data Bank (PDB) or from her hard disk, so that Spial's results can be mapped to it. The Spial submission page can be accessed on

Underlying assumptions

Proteins can be grouped into protein families, whose members all have a common ancestor. The three-dimensional structures of proteins within a family tend to be more similar to each other than their amino acid sequences. Even if amino acid sequence identity falls to 25%, the carbon backbone of the protein tends to follow a common fold within 2 Å (Alberts, Johnson et al. 2002).

Because proteins within a family tend to have similar amino acid sequences, they can be aligned. A multiple sequence alignment is a way of writing out related amino acid sequences below each other so that the amino acid in one protein is positioned directly above or below the corresponding amino acid in the other protein. Such an alignment makes it possible to distinguish between conserved and variable positions (columns). Conserved positions are those where the same amino acid is found in most sequences, whilst variable positions are those where the amino acid differs between most sequences.

It is frequently assumed that the relative functional importance of residues can be inferred from whether they are conserved or variable. The more important a residue is, the more deleterious a change in it is likely to be, and deleterious mutations are more likely to be eliminated by natural selection. Functionally important residues will therefore tend to be more conserved. A residue may be conserved because it is important for maintaining the protein's structure, because it is part of the active site, or because it is involved in the binding of a small molecule, of nucleic acids, or of another protein.

On the shoulders of giants

Spial is not the first computational tool that uses sequence conservation to draw conclusions about functionally important sites. In the following, I introduce the most important precursors and inspirations for Spial (table 2-1) and discuss how they differ from Spial.

Table 2-1. Computational tools that have inspired Spial or with functionality that is comparable to Spial.

Name | Reference | Description
Scorecons | (Valdar 2002) | Scorecons calculates the degree of amino acid variability in each column of a multiple sequence alignment.
Weblogo | (Crooks, Hon et al. 2004) | The user uploads a multiple sequence alignment and Weblogo returns a sequence logo (Schneider and Stephens 1990).
ConSurf | (Armon, Graur et al. 2001) | The user uploads a multiple sequence alignment and a protein structure. ConSurf computes how conserved each position in the alignment is and colours it accordingly in the protein structure.
Evolutionary Trace Analysis | (Lichtarge, Bourne et al. 1996; Mihalek, Res et al. 2006) | The aim of Evolutionary Trace Analysis is to identify evolutionarily important residues in a multiple sequence alignment. After defining subgroups in a phylogenetic tree computed from the alignment, Evolutionary Trace Analysis defines those residues as trace residues that are conserved within subgroups but not necessarily between subgroups.
Pairwise HMM logos | (Schuster-Böckler and Bateman 2005) | Given two multiple sequence alignments, this tool generates a logo for each and returns them in such a way that the positions in each alignment correspond.
Two Sample Logo | (Vacic, Iakoucheva et al. 2006) | With Two Sample Logo, the user can visualise the differences between two multiple sequence alignments. Two Sample Logo calculates how significantly the amino acids differ between the two alignments in each position.

Scorecons

Scorecons (Valdar 2002) is a tool that takes a multiple sequence alignment as input and returns the conservation of each position in the alignment in text format. The user can select the measure of conservation, the default being a method that refers to the variability of each alignment position.

Sequence logos

Sequence logos (Schneider and Stephens 1990) display the conservation of each position in an alignment in an intuitive way. The amino acids that occur at each position are drawn so that their relative heights correspond to their prevalence at this position. Weblogo (Crooks, Hon et al. 2004) is a tool for generating such sequence logos, and Spial makes use of it.
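To make the link between prevalence and letter height concrete, the following Python sketch computes letter heights for a single alignment column in the manner described by Schneider and Stephens: the information content of the column is log2(20) minus its Shannon entropy, and each amino acid receives a share of that height in proportion to its frequency. The function name `logo_heights`, the exclusion of gaps, and the omission of the small-sample correction are assumptions made for illustration; this is not Weblogo's code.

```python
import math
from collections import Counter

GAP = "-"


def logo_heights(column, alphabet_size=20):
    """Return {amino acid: letter height in bits} for one alignment column.

    Assumption: gaps are ignored and no small-sample correction is applied.
    """
    residues = [aa for aa in column if aa != GAP]
    if not residues:
        return {}
    counts = Counter(residues)
    freqs = {aa: n / len(residues) for aa, n in counts.items()}
    # Shannon entropy of the observed amino acid distribution, in bits.
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Information content: maximum possible entropy minus observed entropy.
    information = math.log2(alphabet_size) - entropy
    # Each letter is scaled by its frequency within the column.
    return {aa: f * information for aa, f in freqs.items()}


if __name__ == "__main__":
    # A fully conserved column and a column with two amino acids.
    print(logo_heights("VVVVV"))
    print(logo_heights("DDDEE"))
```

A fully conserved column yields a single letter of maximal height, whereas a column in which all amino acids are equally frequent yields letters of almost no height.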

ConSurf

ConSurf (Armon, Graur et al. 2001; Glaser, Pupko et al. 2003; Landau, Mayrose et al. 2005) is similar to Scorecons in that it takes a multiple sequence alignment as input and supplies information about the conservation of each position in the alignment as output. However, ConSurf also takes a protein structure as input, and it reports the conservation of each alignment position by colour-coding the corresponding amino acid in the protein. This makes it possible to identify conserved surface patches whose amino acids are located at different positions of the alignment but come together in the structure due to protein folding.

The evolutionary trace method

The evolutionary trace method (Lichtarge, Bourne et al. 1996) can be applied using the Evolutionary trace report_maker (Mihalek, Res et al. 2006). This tool takes a protein structure as input and then automatically identifies associated sequence data. Based on this, it creates a report on the amount of selection pressure over evolutionary time on each residue in the structure. The Evolutionary trace report_maker does this by first constructing a phylogenetic tree from a multiple sequence alignment of the sequence data. For each node in the tree, a consensus sequence is then computed, and the variability of the consensus sequences is mapped onto the protein structure.

Pairwise HMM logos

Pairwise hidden Markov model (HMM) logos are a tool for visualising and comparing the logos of two protein family subgroups in an intuitive way. Unlike Spial, the tool automatically decides which positions in the two subgroups correspond. One advantage of the web server provided by the authors is that although the user can upload her own alignments, this is not strictly necessary, as it is alternatively possible to use the alignments available in the protein family database Pfam (Finn, Mistry et al. 2006).
Two Sample Logo

Of all the tools reviewed here, Two Sample Logo (Vacic, Iakoucheva et al. 2006) is the one that is most similar to Spial. Like Spial, it takes two alignments of related protein family subgroups as input. As output, it supplies two logos on top of each other. Residues that are specific to the first alignment appear in the top logo, residues that are specific to the second alignment appear in the bottom logo, and consensus residues that are specific to both alignments appear in a line between the bottom and the top logo.

A functionality that is offered by Two Sample Logo but not by Spial is the computation of a p-value for each position in the input alignments using a t-test or a binomial test. Both tests estimate the p-value of the null hypothesis that both input alignments were generated by the same distribution. Spial does not calculate p-values because I would argue that they are not meaningful in this context. The reason is phylogenetic non-independence (Felsenstein 1985): different branches of a phylogeny cannot be considered independent data points, as they are related by descent. In the case of Two Sample Logo, two sequences in the alignments provided by the user may have the same amino acid in the same position simply because they are closely related. Because a t-test requires that all observations are independent, it is not appropriate to apply it in this case.

The Spial algorithm

For the input alignments, both the FASTA and SELEX alignment formats are accepted. The sequences in the two input alignments must originate from two protein subtypes with related sequences. The two alignments have to be of the same length, and their positions have to correspond. One way to produce the two input alignments is to align all sequences using an alignment program such as Muscle (Edgar 2004), and then to split the resulting alignment into two separate input alignment files. In the following, I refer to these two alignments as alignment A and alignment B, respectively. Additionally, Spial accepts a protein structure in Protein Data Bank (PDB) format as input.

For each position in the two input alignments A and B, Spial decides whether the residue is consensus or not. For an amino acid to be consensus, it has to be present above a certain threshold proportion in both alignments. This threshold can be specified by the user. In figure 2-2A, a consensus threshold value of 0.35 was used.

Next, Spial decides whether there are amino acids that are characteristic of one of the two alignments, but not of the consensus. For this, a non-consensus amino acid has to be present above a certain threshold proportion in one of the alignments. Again, this threshold can be specified by the user. In figure 2-2A, a specificity threshold value of 0.35 was used. As long as the sum of the consensus threshold and the specificity threshold is lower than one, a position can be specific to the consensus and specific to one of the alignments at the same time. Therefore, there are eight possible combinations of specificity for alignment A, alignment B, or the consensus (figure 2-2B). I refer to these eight combinations as types. Each position in an alignment can also be one of three possible categories, which indicate whether the position is specific to the consensus (C), specific to one or both of the input alignments but not the consensus (S), or not specific at all (0). The one-letter codes that specify the type and category of each position are located in two rows below the Spial output alignment (figure 2-2A).

Spial's output consists of coloured alignments as described above, of sequence logos (Schneider and Stephens 1990; Crooks, Hon et al. 2004), and of coloured protein structures (figure 2-2A). The logos produced by Spial appear similar to those produced by the program Two Sample Logo (Vacic, Iakoucheva et al. 2006), which treats one alignment as the background and then computes whether there are residues that are enriched or depleted in the other alignment. Spial logos differ from this by visualising how frequent a residue is in either alignment or, if it is a consensus residue, how frequent it is in the consensus.

In the output protein structures, the default colouring scheme differs from that used in the alignments. The colour of each protein residue reflects whether it is specific to alignment A (red), specific to alignment B (green), specific to both (yellow), or specific to neither (black) (figure 2-3A). The proteins that are coloured in this way can be viewed either directly in the browser if a Chime plug-in is installed, or by loading the structure into the PyMOL structure viewer and then running a script that is provided by Spial. Another option offered by Spial is the colouring of residues according to residue type as defined above.
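The classification into types and categories can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not Spial's actual implementation: the function names, the counting of gaps in the denominator of each frequency, the strict "greater than" comparison, and the numbering of types as 4 (consensus) + 2 (alignment A) + 1 (alignment B) are all choices made here. The numbering was chosen so that type 3 means "specific for alignment A and B separately", as in the STAT5a example. The input data are the example alignments shown in figure 2-2A.

```python
from collections import Counter

GAP = "-"


def column_frequencies(alignment, col):
    """Proportion of sequences carrying each amino acid in one column.

    Assumption: gaps are never counted as residues, but the denominator
    is the total number of sequences in the alignment.
    """
    counts = Counter(seq[col] for seq in alignment if seq[col] != GAP)
    return {aa: n / len(alignment) for aa, n in counts.items()}


def classify_column(freq_a, freq_b, consensus_threshold=0.35,
                    specificity_threshold=0.35):
    """Return (type, category) for one pair of corresponding columns."""
    # Consensus amino acids exceed the consensus threshold in BOTH alignments.
    consensus = {aa for aa, f in freq_a.items()
                 if f > consensus_threshold
                 and freq_b.get(aa, 0.0) > consensus_threshold}
    # Specific amino acids are non-consensus and frequent in ONE alignment.
    spec_a = {aa for aa, f in freq_a.items()
              if aa not in consensus and f > specificity_threshold}
    spec_b = {aa for aa, f in freq_b.items()
              if aa not in consensus and f > specificity_threshold}
    # Hypothetical type numbering: one bit each for consensus, A and B.
    type_code = 4 * bool(consensus) + 2 * bool(spec_a) + 1 * bool(spec_b)
    if consensus:
        category = "C"
    elif spec_a or spec_b:
        category = "S"
    else:
        category = "0"
    return type_code, category


def classify_alignments(aln_a, aln_b, **thresholds):
    """Classify every position of two alignments of equal length."""
    length = len(aln_a[0])
    if any(len(seq) != length for seq in aln_a + aln_b):
        raise ValueError("all sequences must have the same length")
    return [classify_column(column_frequencies(aln_a, i),
                            column_frequencies(aln_b, i), **thresholds)
            for i in range(length)]


# Example alignments from figure 2-2A.
ALIGNMENT_A = ["AD-RVAT-SH", "ADYKV-S-SH", "ADY-VVS-SS",
               "AEYGVIS-SS", "VEYHVMT-SS"]
ALIGNMENT_B = ["AEWTLMTPSM", "AEWTILTPSM", "AEWTMITPPS", "AEWTGFTPPS"]

result = classify_alignments(ALIGNMENT_A, ALIGNMENT_B)
types = [t for t, _ in result]
categories = "".join(c for _, c in result)
print(types)       # [4, 6, 3, 1, 2, 0, 6, 1, 5, 7]
print(categories)  # CCSSS0CSCC
```

With both thresholds at 0.35, the second position illustrates how a column can be specific to the consensus and to one alignment at once: E exceeds the threshold in both alignments and is therefore consensus, whilst D is frequent in alignment A only.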

[Figure 2-2, panel A: input alignments and optional input PDB structure; Spial output alignment with type and category rows; logo with rows for alignment A specific residues, consensus residues, and alignment B specific residues; optional structure with alignment specificities colour-coded.]

Example input alignments shown in figure 2-2A:

# alignment A
seq1.1 AD-RVAT-SH
seq1.2 ADYKV-S-SH
seq1.3 ADY-VVS-SS
seq1.4 AEYGVIS-SS
seq1.5 VEYHVMT-SS

# alignment B
seq2.1 AEWTLMTPSM
seq2.2 AEWTILTPSM
seq2.3 AEWTMITPPS
seq2.4 AEWTGFTPPS

Category row below the output alignment: C C S S S 0 C S C C

[Figure 2-2, panel B: table of types and categories.]

Type | Category | Specific for | Explanation
0 | 0 | none | no specificity
1 | S | B | specific for alignment B, but not for alignment A
2 | S | A | specific for alignment A, but not for alignment B
3 | S | A, B | specific for alignment A and B separately
4 | C | C | specific for the consensus only
5 | C | C, B | specific for the consensus and alignment B
6 | C | C, A | specific for the consensus and alignment A
7 | C | C, A, B | specific for the consensus and alignments A and B

Figure 2-2. Spial input and output. (A) Spial takes two alignments (A and B) and optionally a PDB protein structure as input. The output consists of an alignment, a logo, and optionally the structure with colour-coded residues. In the coloured alignment, each position or column is coloured according to whether the position is specific to alignment A, alignment B, the consensus, or a combination of the three. A row of numbers and letters below the alignment gives further information on the specificity of that position. The logo consists of three rows that show which residues frequently occur at each position in alignment A (top row), the consensus (centre row), or alignment B (bottom row). The colours of the amino acids in the logo correspond to the properties of the amino acid. The residues of the structure are coloured according to how frequent they are in alignment A or in alignment B. (B) Each position in an alignment output by Spial can be one of eight possible types. The type of each position is determined by whether it is specific to alignment A, alignment B, the consensus, or any combination of those. Each position in an alignment can also be one of three possible categories, which indicate whether the position is specific to the consensus (C), specific to one or both of the input alignments but not the consensus (S), or not specific at all (0).

Examples of use

Spial is a versatile tool with a number of potential applications. Scenarios in which Spial may be useful include:

- Of a number of homologous proteins, some bind a certain ligand or drug whilst others do not. Spial can assist in identifying surface patches that are specific to the proteins that bind the ligand.
- A protein has homologues in two different evolutionary lineages. Spial can assist in identifying residues that are specific to either lineage, and those that are conserved in both.
- Of a number of paralogues in a genome, some have a specific function whilst others do not. Spial can assist in identifying the residues that are specific to the proteins that have the function of interest.
- Spial can assist in identifying residues that undergo post-translational modifications by running an alignment of sequences that are commonly modified against an alignment of related sequences that are not.
- Some single nucleotide polymorphisms (SNPs) may be specific to certain subpopulations within a species. Spial can assist in exploring the specificity of SNPs within subpopulations and mapping them to protein structures.

In the following, I describe two examples of the use of Spial. A Spial comparison of STAT5a and STAT4 allows the identification of residues that are integral to the dimerisation interface of STAT5a, and a Spial comparison of cephalopod and vertebrate rhodopsins allows the identification of differences in the way the retinal moiety is co-ordinated.

Example 1: The dimerisation interface of STAT5a

I have used Spial to compare two of the seven known families of the signal transducer and activator of transcription (STAT) proteins (Aaronson and Horvath 2002; Rawlings, Rosler et al. 2004). I compared an alignment of STAT5a orthologues with an alignment of STAT4 orthologues from different animal species. Both alignments were obtained from the Ensembl genome database. I then subjected the two alignments to Spial analysis and mapped the specificities I obtained onto a crystallographic structure of the unphosphorylated core STAT5a (Neculai, Neculai et al. 2005). Unphosphorylated STAT5a dimerises in a way that is different to the dimerisation mode of STAT4 via Src-homology 2 (SH2) domains.
By concentrating on the residues involved in intermolecular contacts between the STAT5a dimers, I was able to show that they tend to be highly conserved within STAT5a orthologues, but not conserved in STAT4 orthologues (figure 2-3B). Because residues that are located at the interface of protein-protein interactions tend to be conserved (Mintseris and Weng 2005), and because in this case most of the interface residues are of type 3 (specific for the STAT5a and the STAT4 alignment separately; see figure 2-3B), the interaction between the subunits of the STAT5a dimer may be specific. For the Spial output for this example, see

[Figure 2-3: panel A, colour key with axes "frequency in alignment A" and "frequency in alignment B" and regions labelled "specific to alignment A", "consensus residue", and "specific to alignment B"; panel B, interface of the STAT5a dimer; panel C, residues surrounding retinal, with residue labels.]

Figure 2-3. How Spial maps the specificities of residues to protein structures. (A) The colour scheme used by Spial to indicate whether a residue is specific to alignment A, to alignment B, to both, or to neither. Residues that are specific to alignment A only are in red, those that are specific to alignment B only are in green, and those that are specific to both are in yellow. Residues with a low frequency in both alignments are coloured in black. (B) Cartoon representation of the interface of the STAT5a dimer with the interface running from the lower left to the upper right. Residues involved in the protein-protein interaction are represented as sticks. Hydrogen bonds are represented as yellow dashed lines. The colour scheme for the interface residues is as above, where alignment A is STAT5a and alignment B is STAT4. (C) Residues within 4 Å of retinal (in blue). The colour scheme is as above, where alignment A is cephalopod rhodopsins and alignment B is vertebrate rhodopsins.

Example 2: Differences in co-ordination of retinal between vertebrate and cephalopod rhodopsin

Opsins are a family of seven-helix membrane receptors that activate G proteins in a light-dependent manner via the photoisomerisation of a chromophore moiety embedded in the protein. Vertebrate and cephalopod rhodopsins are two subgroups of opsins. Though related, they differ in their molecular properties and function (Terakita 2005). Whilst vertebrate rhodopsin activates the cyclic GMP signalling pathway, invertebrate rhodopsin activates the inositol 1,4,5-trisphosphate signalling pathway via a Gq-type G protein (Murakami and Kouyama 2008). It is not clear what causes the functional difference between these two rhodopsin families.

I used alignments of cephalopod and vertebrate rhodopsins as published on the GPCR database (Horn, Bettler et al. 2003) as input for Spial and mapped the results to the structure of squid rhodopsin (Murakami and Kouyama 2008). The result shows that some, but not all, of the residues that co-ordinate retinal are conserved between vertebrate and cephalopod rhodopsin. For example, Lys 35, which covalently binds retinal, is conserved in both vertebrate and cephalopod rhodopsin. Other hydrophobic residues in the retinal binding pocket, including Phe 2 and Phe 88, are specific to cephalopod rhodopsin and are not conserved in vertebrate rhodopsin (figure 2-3C). Phe 25, although it is part of the binding pocket in the squid rhodopsin structure used here, is generally conserved in vertebrates but not in cephalopods. For the Spial output for this example, see

Conclusions

The above examples illustrate how Spial can be used as a tool for the identification and visualisation of information about the specificity of protein residues, and how this information can be used to understand protein function, protein-small molecule interactions, and protein-protein interactions.

References

Aaronson, D. S. and C. M. Horvath (2002). "A road map for those who don't know JAK-STAT." Science 296(5573).
Alberts, B., A. Johnson, et al. (2002). Molecular Biology of the Cell. New York, Garland Science.
Armon, A., D. Graur, et al. (2001). "ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information." J Mol Biol 307(1).
Crooks, G. E., G. Hon, et al. (2004). "WebLogo: a sequence logo generator." Genome Res 14(6).
Edgar, R. C. (2004). "MUSCLE: multiple sequence alignment with high accuracy and high throughput." Nucleic Acids Res 32(5).
Felsenstein, J. (1985). "Phylogenies and the comparative method." Am Nat 125(1): 1-15.
Finn, R. D., J. Mistry, et al. (2006). "Pfam: clans, web tools and services." Nucleic Acids Res 34(Database issue).
Glaser, F., T. Pupko, et al. (2003). "ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information." Bioinformatics 19(1).
Horn, F., E. Bettler, et al. (2003). "GPCRDB information system for G protein-coupled receptors." Nucleic Acids Res 31(1).
Landau, M., I. Mayrose, et al. (2005). "ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures." Nucleic Acids Res 33(Web Server issue).
Lichtarge, O., H. R. Bourne, et al. (1996). "An evolutionary trace method defines binding surfaces common to protein families." J Mol Biol 257(2).
Mihalek, I., I. Res, et al. (2006). "Evolutionary trace report_maker: a new type of service for comparative analysis of proteins." Bioinformatics 22(13).
Mintseris, J. and Z. Weng (2005). "Structure, function, and evolution of transient and obligate protein-protein interactions." Proc Natl Acad Sci U S A 102(31).
Murakami, M. and T. Kouyama (2008). "Crystal structure of squid rhodopsin." Nature 453(7193).
Neculai, D., A. M. Neculai, et al. (2005). "Structure of the unphosphorylated STAT5a dimer." J Biol Chem 280(49).
Rawlings, J. S., K. M. Rosler, et al. (2004). "The JAK/STAT signaling pathway." J Cell Sci 117(Pt 8).
Schneider, T. D. and R. M. Stephens (1990). "Sequence logos: a new way to display consensus sequences." Nucleic Acids Res 18(20).
Schuster-Böckler, B. and A. Bateman (2005). "Visualizing profile-profile alignment: pairwise HMM logos." Bioinformatics 21(12).
Terakita, A. (2005). "The opsins." Genome Biol 6(3): 213.
Vacic, V., L. M. Iakoucheva, et al. (2006). "Two Sample Logo: a graphical representation of the differences between two sets of sequence alignments." Bioinformatics 22(12).
Valdar, W. S. (2002). "Scoring residue conservation." Proteins 48(2).


Copyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years. Structure Determination and Sequence Analysis The vast majority of the experimentally determined three-dimensional protein structures have been solved by one of two methods: X-ray diffraction and Nuclear

More information

Supplementary Figure 1 Crystal packing of ClR and electron density maps. Crystal packing of type A crystal (a) and type B crystal (b).

Supplementary Figure 1 Crystal packing of ClR and electron density maps. Crystal packing of type A crystal (a) and type B crystal (b). Supplementary Figure 1 Crystal packing of ClR and electron density maps. Crystal packing of type A crystal (a) and type B crystal (b). Crystal contacts at B-C loop are magnified and stereo view of A-weighted

More information

Modelling of Possible Binding Modes of Caffeic Acid Derivatives to JAK3 Kinase

Modelling of Possible Binding Modes of Caffeic Acid Derivatives to JAK3 Kinase John von Neumann Institute for Computing Modelling of Possible Binding Modes of Caffeic Acid Derivatives to JAK3 Kinase J. Kuska, P. Setny, B. Lesyng published in From Computational Biophysics to Systems

More information

NGF - twenty years a-growing

NGF - twenty years a-growing NGF - twenty years a-growing A molecule vital to brain growth It is twenty years since the structure of nerve growth factor (NGF) was determined [ref. 1]. This molecule is more than 'quite interesting'

More information

User Guide for LeDock

User Guide for LeDock User Guide for LeDock Hongtao Zhao, PhD Email: htzhao@lephar.com Website: www.lephar.com Copyright 2017 Hongtao Zhao. All rights reserved. Introduction LeDock is flexible small-molecule docking software,

More information

Identifying Interaction Hot Spots with SuperStar

Identifying Interaction Hot Spots with SuperStar Identifying Interaction Hot Spots with SuperStar Version 1.0 November 2017 Table of Contents Identifying Interaction Hot Spots with SuperStar... 2 Case Study... 3 Introduction... 3 Generate SuperStar Maps

More information

Large-Scale Genomic Surveys

Large-Scale Genomic Surveys Bioinformatics Subtopics Fold Recognition Secondary Structure Prediction Docking & Drug Design Protein Geometry Protein Flexibility Homology Modeling Sequence Alignment Structure Classification Gene Prediction

More information

Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences

Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD William and Nancy Thompson Missouri Distinguished Professor Department

More information

ALL LECTURES IN SB Introduction

ALL LECTURES IN SB Introduction 1. Introduction 2. Molecular Architecture I 3. Molecular Architecture II 4. Molecular Simulation I 5. Molecular Simulation II 6. Bioinformatics I 7. Bioinformatics II 8. Prediction I 9. Prediction II ALL

More information

1-D Predictions. Prediction of local features: Secondary structure & surface exposure

1-D Predictions. Prediction of local features: Secondary structure & surface exposure 1-D Predictions Prediction of local features: Secondary structure & surface exposure 1 Learning Objectives After today s session you should be able to: Explain the meaning and usage of the following local

More information

Supplementary Materials for

Supplementary Materials for advances.sciencemag.org/cgi/content/full/1/8/e1500527/dc1 Supplementary Materials for A phylogenomic data-driven exploration of viral origins and evolution The PDF file includes: Arshan Nasir and Gustavo

More information

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:

More information

Genome Annotation Project Presentation

Genome Annotation Project Presentation Halogeometricum borinquense Genome Annotation Project Presentation Loci Hbor_05620 & Hbor_05470 Presented by: Mohammad Reza Najaf Tomaraei Hbor_05620 Basic Information DNA Coordinates: 527,512 528,261

More information

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

CISC 636 Computational Biology & Bioinformatics (Fall 2016) CISC 636 Computational Biology & Bioinformatics (Fall 2016) Predicting Protein-Protein Interactions CISC636, F16, Lec22, Liao 1 Background Proteins do not function as isolated entities. Protein-Protein

More information

Structure to Function. Molecular Bioinformatics, X3, 2006

Structure to Function. Molecular Bioinformatics, X3, 2006 Structure to Function Molecular Bioinformatics, X3, 2006 Structural GeNOMICS Structural Genomics project aims at determination of 3D structures of all proteins: - organize known proteins into families

More information

EBI web resources II: Ensembl and InterPro. Yanbin Yin Spring 2013

EBI web resources II: Ensembl and InterPro. Yanbin Yin Spring 2013 EBI web resources II: Ensembl and InterPro Yanbin Yin Spring 2013 1 Outline Intro to genome annotation Protein family/domain databases InterPro, Pfam, Superfamily etc. Genome browser Ensembl Hands on Practice

More information

An Introduction to Sequence Similarity ( Homology ) Searching

An Introduction to Sequence Similarity ( Homology ) Searching An Introduction to Sequence Similarity ( Homology ) Searching Gary D. Stormo 1 UNIT 3.1 1 Washington University, School of Medicine, St. Louis, Missouri ABSTRACT Homologous sequences usually have the same,

More information

Motif Prediction in Amino Acid Interaction Networks

Motif Prediction in Amino Acid Interaction Networks Motif Prediction in Amino Acid Interaction Networks Omar GACI and Stefan BALEV Abstract In this paper we represent a protein as a graph where the vertices are amino acids and the edges are interactions

More information

CHEM 463: Advanced Inorganic Chemistry Modeling Metalloproteins for Structural Analysis

CHEM 463: Advanced Inorganic Chemistry Modeling Metalloproteins for Structural Analysis CHEM 463: Advanced Inorganic Chemistry Modeling Metalloproteins for Structural Analysis Purpose: The purpose of this laboratory is to introduce some of the basic visualization and modeling tools for viewing

More information

Patrick: An Introduction to Medicinal Chemistry 5e Chapter 04

Patrick: An Introduction to Medicinal Chemistry 5e Chapter 04 01) Which of the following statements is not true about receptors? a. Most receptors are proteins situated inside the cell. b. Receptors contain a hollow or cleft on their surface which is known as a binding

More information

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE Examples of Protein Modeling Protein Modeling Visualization Examination of an experimental structure to gain insight about a research question Dynamics To examine the dynamics of protein structures To

More information

Pymol Practial Guide

Pymol Practial Guide Pymol Practial Guide Pymol is a powerful visualizor very convenient to work with protein molecules. Its interface may seem complex at first, but you will see that with a little practice is simple and powerful

More information

Comparing whole genomes

Comparing whole genomes BioNumerics Tutorial: Comparing whole genomes 1 Aim The Chromosome Comparison window in BioNumerics has been designed for large-scale comparison of sequences of unlimited length. In this tutorial you will

More information

Pairwise sequence alignment

Pairwise sequence alignment Department of Evolutionary Biology Example Alignment between very similar human alpha- and beta globins: GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL G+ +VK+HGKKV A+++++AH+D++ +++++LS+LH KL GNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKL

More information

Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

More information

EBI web resources II: Ensembl and InterPro

EBI web resources II: Ensembl and InterPro EBI web resources II: Ensembl and InterPro Yanbin Yin http://www.ebi.ac.uk/training/online/course/ 1 Homework 3 Go to http://www.ebi.ac.uk/interpro/training.htmland finish the second online training course

More information

In-Depth Assessment of Local Sequence Alignment

In-Depth Assessment of Local Sequence Alignment 2012 International Conference on Environment Science and Engieering IPCBEE vol.3 2(2012) (2012)IACSIT Press, Singapoore In-Depth Assessment of Local Sequence Alignment Atoosa Ghahremani and Mahmood A.

More information

Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase Cwc27

Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase Cwc27 Acta Cryst. (2014). D70, doi:10.1107/s1399004714021695 Supporting information Volume 70 (2014) Supporting information for article: Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

PGA: A Program for Genome Annotation by Comparative Analysis of. Maximum Likelihood Phylogenies of Genes and Species

PGA: A Program for Genome Annotation by Comparative Analysis of. Maximum Likelihood Phylogenies of Genes and Species PGA: A Program for Genome Annotation by Comparative Analysis of Maximum Likelihood Phylogenies of Genes and Species Paulo Bandiera-Paiva 1 and Marcelo R.S. Briones 2 1 Departmento de Informática em Saúde

More information

Hands-on Course in Computational Structural Biology and Molecular Simulation BIOP590C/MCB590C. Course Details

Hands-on Course in Computational Structural Biology and Molecular Simulation BIOP590C/MCB590C. Course Details Hands-on Course in Computational Structural Biology and Molecular Simulation BIOP590C/MCB590C Emad Tajkhorshid Center for Computational Biology and Biophysics Email: emad@life.uiuc.edu or tajkhors@uiuc.edu

More information

Subfamily HMMS in Functional Genomics. D. Brown, N. Krishnamurthy, J.M. Dale, W. Christopher, and K. Sjölander

Subfamily HMMS in Functional Genomics. D. Brown, N. Krishnamurthy, J.M. Dale, W. Christopher, and K. Sjölander Subfamily HMMS in Functional Genomics D. Brown, N. Krishnamurthy, J.M. Dale, W. Christopher, and K. Sjölander Pacific Symposium on Biocomputing 10:322-333(2005) SUBFAMILY HMMS IN FUNCTIONAL GENOMICS DUNCAN

More information

Molecular Visualization. Introduction

Molecular Visualization. Introduction Molecular Visualization Jeffry D. Madura Department of Chemistry & Biochemistry Center for Computational Sciences Duquesne University Introduction Assessments of change, dynamics, and cause and effect

More information

Detection of Protein Binding Sites II

Detection of Protein Binding Sites II Detection of Protein Binding Sites II Goal: Given a protein structure, predict where a ligand might bind Thomas Funkhouser Princeton University CS597A, Fall 2007 1hld Geometric, chemical, evolutionary

More information

Introduction to Structure Preparation and Visualization

Introduction to Structure Preparation and Visualization Introduction to Structure Preparation and Visualization Created with: Release 2018-4 Prerequisites: Release 2018-2 or higher Access to the internet Categories: Molecular Visualization, Structure-Based

More information

Prediction and Classif ication of Human G-protein Coupled Receptors Based on Support Vector Machines

Prediction and Classif ication of Human G-protein Coupled Receptors Based on Support Vector Machines Article Prediction and Classif ication of Human G-protein Coupled Receptors Based on Support Vector Machines Yun-Fei Wang, Huan Chen, and Yan-Hong Zhou* Hubei Bioinformatics and Molecular Imaging Key Laboratory,

More information

Basics on bioinforma-cs Lecture 7. Nunzio D Agostino

Basics on bioinforma-cs Lecture 7. Nunzio D Agostino Basics on bioinforma-cs Lecture 7 Nunzio D Agostino nunzio.dagostino@entecra.it; nunzio.dagostino@gmail.com Multiple alignments One sequence plays coy a pair of homologous sequence whisper many aligned

More information

Computational Structural Biology and Molecular Simulation. Introduction to VMD Molecular Visualization and Analysis

Computational Structural Biology and Molecular Simulation. Introduction to VMD Molecular Visualization and Analysis Computational Structural Biology and Molecular Simulation Introduction to VMD Molecular Visualization and Analysis Emad Tajkhorshid Department of Biochemistry, Beckman Institute, Center for Computational

More information

A Machine Text-Inspired Machine Learning Approach for Identification of Transmembrane Helix Boundaries

A Machine Text-Inspired Machine Learning Approach for Identification of Transmembrane Helix Boundaries A Machine Text-Inspired Machine Learning Approach for Identification of Transmembrane Helix Boundaries Betty Yee Man Cheng 1, Jaime G. Carbonell 1, and Judith Klein-Seetharaman 1, 2 1 Language Technologies

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Figure S1. Secondary structure of CAP (in the camp 2 -bound state) 10. α-helices are shown as cylinders and β- strands as arrows. Labeling of secondary structure is indicated. CDB, DBD and the hinge are

More information

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.

Protein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki. Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein

More information

A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family

A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family Jieming Shen 1,2 and Hugh B. Nicholas, Jr. 3 1 Bioengineering and Bioinformatics Summer

More information

Algorithms in Computational Biology (236522) spring 2008 Lecture #1

Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: 15:30-16:30/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office hours:??

More information

Protein Structure and Visualisation. Introduction to PDB and PyMOL

Protein Structure and Visualisation. Introduction to PDB and PyMOL Protein Structure and Visualisation Introduction to PDB and PyMOL 1 Feedback Persons http://www.bio-evaluering.dk/ 2 Program 8.00-8.15 Quiz results 8.15-8.50 Introduction to PDB & PyMOL 8.50-9.00 Break

More information

tconcoord-gui: Visually Supported Conformational Sampling of Bioactive Molecules

tconcoord-gui: Visually Supported Conformational Sampling of Bioactive Molecules Software News and Updates tconcoord-gui: Visually Supported Conformational Sampling of Bioactive Molecules DANIEL SEELIGER, BERT L. DE GROOT Computational Biomolecular Dynamics Group, Max-Planck-Institute

More information

SAM Teacher s Guide Protein Partnering and Function

SAM Teacher s Guide Protein Partnering and Function SAM Teacher s Guide Protein Partnering and Function Overview Students explore protein molecules physical and chemical characteristics and learn that these unique characteristics enable other molecules

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Basic Local Alignment Search Tool

Basic Local Alignment Search Tool Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses

More information

Introduction to Bioinformatics Online Course: IBT

Introduction to Bioinformatics Online Course: IBT Introduction to Bioinformatics Online Course: IBT Multiple Sequence Alignment Building Multiple Sequence Alignment Lec1 Building a Multiple Sequence Alignment Learning Outcomes 1- Understanding Why multiple

More information

RNA Polymerase I Contains a TFIIF-Related DNA-Binding Subcomplex

RNA Polymerase I Contains a TFIIF-Related DNA-Binding Subcomplex Molecular Cell, Volume 39 Supplemental Information RNA Polymerase I Contains a TFIIFRelated DNABinding Subcomplex Sebastian R. Geiger, Kristina Lorenzen, Amelie Schreieck, Patrizia Hanecker, Dirk Kostrewa,

More information

PHYLOGENY AND SYSTEMATICS

PHYLOGENY AND SYSTEMATICS AP BIOLOGY EVOLUTION/HEREDITY UNIT Unit 1 Part 11 Chapter 26 Activity #15 NAME DATE PERIOD PHYLOGENY AND SYSTEMATICS PHYLOGENY Evolutionary history of species or group of related species SYSTEMATICS Study

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S1 (box). Supplementary Methods description. Prokaryotic Genome Database Archaeal and bacterial genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES

USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES HOW CAN BIOINFORMATICS BE USED AS A TOOL TO DETERMINE EVOLUTIONARY RELATIONSHPS AND TO BETTER UNDERSTAND PROTEIN HERITAGE?

More information

Grundlagen der Bioinformatik Summer semester Lecturer: Prof. Daniel Huson

Grundlagen der Bioinformatik Summer semester Lecturer: Prof. Daniel Huson Grundlagen der Bioinformatik, SS 10, D. Huson, April 12, 2010 1 1 Introduction Grundlagen der Bioinformatik Summer semester 2010 Lecturer: Prof. Daniel Huson Office hours: Thursdays 17-18h (Sand 14, C310a)

More information

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner Table of Contents Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner Introduction... 2 CSD-CrossMiner Terminology... 2 Overview of CSD-CrossMiner... 3 Features

More information

Analysis of correlated mutations in Ras G-domain

Analysis of correlated mutations in Ras G-domain www.bioinformation.net Volume 13(6) Hypothesis Analysis of correlated mutations in Ras G-domain Ekta Pathak * Bioinformatics Department, MMV, Banaras Hindu University. Ekta Pathak - E-mail: ektavpathak@gmail.com;

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species.

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species. Supplementary Figure 1 Icm/Dot secretion system region I in 41 Legionella species. Homologs of the effector-coding gene lega15 (orange) were found within Icm/Dot region I in 13 Legionella species. In four

More information

CSCE555 Bioinformatics. Protein Function Annotation

CSCE555 Bioinformatics. Protein Function Annotation CSCE555 Bioinformatics Protein Function Annotation Why we need to do function annotation? Fig from: Network-based prediction of protein function. Molecular Systems Biology 3:88. 2007 What s function? The

More information

Effects of Gap Open and Gap Extension Penalties

Effects of Gap Open and Gap Extension Penalties Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See

More information

Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are:

Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are: Comparative genomics and proteomics Species available Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are: Vertebrates: human, chimpanzee, mouse, rat,

More information

CAP 5510 Lecture 3 Protein Structures

CAP 5510 Lecture 3 Protein Structures CAP 5510 Lecture 3 Protein Structures Su-Shing Chen Bioinformatics CISE 8/19/2005 Su-Shing Chen, CISE 1 Protein Conformation 8/19/2005 Su-Shing Chen, CISE 2 Protein Conformational Structures Hydrophobicity

More information

Gürol M. Süel, Steve W. Lockless, Mark A. Wall, and Rama Ra

Gürol M. Süel, Steve W. Lockless, Mark A. Wall, and Rama Ra Gürol M. Süel, Steve W. Lockless, Mark A. Wall, and Rama Ranganathan, Evolutionarily conserved networks of residues mediate allosteric communication in proteins, Nature Structural Biology, vol. 10, no.

More information

Protein Structure: Data Bases and Classification Ingo Ruczinski

Protein Structure: Data Bases and Classification Ingo Ruczinski Protein Structure: Data Bases and Classification Ingo Ruczinski Department of Biostatistics, Johns Hopkins University Reference Bourne and Weissig Structural Bioinformatics Wiley, 2003 More References

More information

SDR: A Database of Predicted Specificity- Determining Residues in Proteins

SDR: A Database of Predicted Specificity- Determining Residues in Proteins SDR: A Database of Predicted Specificity- Determining Residues in Proteins The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation

More information

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Introduction to Comparative Protein Modeling. Chapter 4 Part I Introduction to Comparative Protein Modeling Chapter 4 Part I 1 Information on Proteins Each modeling study depends on the quality of the known experimental data. Basis of the model Search in the literature

More information

Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program)

Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program) Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program) Course Name: Structural Bioinformatics Course Description: Instructor: This course introduces fundamental concepts and methods for structural

More information

Sequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5

Sequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5 Sequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5 Why Look at More Than One Sequence? 1. Multiple Sequence Alignment shows patterns of conservation 2. What and how many

More information

Last updated: Copyright

Last updated: Copyright Last updated: 2012-08-20 Copyright 2004-2012 plabel (v2.4) User s Manual by Bioinformatics Group, Institute of Computing Technology, Chinese Academy of Sciences Tel: 86-10-62601016 Email: zhangkun01@ict.ac.cn,

More information

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9 Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

More information

Supporting Text 1. Comparison of GRoSS sequence alignment to HMM-HMM and GPCRDB

Supporting Text 1. Comparison of GRoSS sequence alignment to HMM-HMM and GPCRDB Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications, Cvicek et al. Supporting Text 1 Here we compare the GRoSS alignment

More information

Building a Homology Model of the Transmembrane Domain of the Human Glycine α-1 Receptor

Building a Homology Model of the Transmembrane Domain of the Human Glycine α-1 Receptor Building a Homology Model of the Transmembrane Domain of the Human Glycine α-1 Receptor Presented by Stephanie Lee Research Mentor: Dr. Rob Coalson Glycine Alpha 1 Receptor (GlyRa1) Member of the superfamily

More information

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega BLAST Multiple Sequence Alignments: Clustal Omega What does basic BLAST do (e.g. what is input sequence and how does BLAST look for matches?) Susan Parrish McDaniel College Multiple Sequence Alignments

More information

Chapter 5. Proteomics and the analysis of protein sequence Ⅱ

Chapter 5. Proteomics and the analysis of protein sequence Ⅱ Proteomics Chapter 5. Proteomics and the analysis of protein sequence Ⅱ 1 Pairwise similarity searching (1) Figure 5.5: manual alignment One of the amino acids in the top sequence has no equivalent and

More information

Sequence Analysis, '18 -- lecture 9. Families and superfamilies. Sequence weights. Profiles. Logos. Building a representative model for a gene.

Sequence Analysis, '18 -- lecture 9. Families and superfamilies. Sequence weights. Profiles. Logos. Building a representative model for a gene. Sequence Analysis, '18 -- lecture 9 Families and superfamilies. Sequence weights. Profiles. Logos. Building a representative model for a gene. How can I represent thousands of homolog sequences in a compact

More information

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information