Introduction to Bioinformatics. Case Study

Case Study

Case 1: How does SIGIRR inhibit the TLR4 and TLR7 signaling pathways?
Case 2: Homology modeling of Toll-like receptor ectodomains

Case 1: How does SIGIRR inhibit the Toll-like receptor TLR4 and TLR7 signaling pathways?

Background: Structure of Toll-like receptors (TLRs)

[Figure: TLR domain architecture: ectodomain (ECD) built from leucine-rich repeats (LRRs), transmembrane domain, TIR domain]

TLRs belong to the Toll-like receptor/interleukin-1 receptor (TLR/IL-1R) superfamily, which is defined by the presence of a conserved cytoplasmic Toll/interleukin-1 receptor (TIR) domain connected to an ectodomain through a single transmembrane stretch. Their ectodomains consist of 16 to 28 leucine-rich repeats (LRRs).

TLR signaling pathways

These LRRs provide a variety of structural frameworks for the binding of protein and non-protein ligands, including lipopolysaccharide (LPS), lipopeptide, CpG DNA, flagellin, and double-/single-stranded RNA.

TLRs recognize their ligands as dimers. Crystal structures have been determined for the following ECD-ligand-ECD complexes: human TLR2-TLR1, mouse TLR3-TLR3, human TLR4-TLR4 and mouse TLR2-TLR6.

Upon receptor activation, an intracellular TIR signaling complex is formed between the TIR domains of the receptor and of downstream adaptors. MyD88 (myeloid differentiation primary response protein 88) was the first intracellular adaptor molecule to be characterized in TLR signaling. It consists of an N-terminal death domain (DD) separated from its C-terminal TIR domain by a linker sequence. When recruited to the receptor complex, MyD88 also forms a dimer through DD-DD and TIR-TIR interactions. Through its DD, MyD88 recruits IRAKs (IL-1RI-associated protein kinases) to continue signaling and, finally, to induce nuclear factor-κB (NF-κB), leading to the expression of type I interferons.

[Figure: domain architecture of a TLR (leucine-rich repeats, TIR domain) compared with SIGIRR (single immunoglobulin (Ig) domain, TIR domain, 73 AA C-terminal tail)]

SIGIRR (single immunoglobulin interleukin-1 receptor-related molecule), also known as TIR8, was initially identified as an Ig domain-containing receptor of the TLR/IL-1R superfamily. Both the extracellular and intracellular domains of SIGIRR differ from those of other Ig domain-containing receptors: its single extracellular Ig domain does not support ligand binding, and its intracellular TIR domain cannot activate NF-κB. Moreover, the TIR domain of SIGIRR extends beyond that of a typical TLR/IL-1R superfamily member by more than 73 amino acids at the C-terminus (the C-tail).

Instead, SIGIRR acts as an endogenous inhibitor of MyD88-dependent TLR and IL-1R signaling. This was shown by overexpressing SIGIRR in Jurkat or HepG2 cells, which substantially reduced LPS-, CpG DNA- or IL-1-induced activation of NF-κB. SIGIRR has therefore attracted considerable research interest because of its regulatory function in cancer-related inflammation and autoimmunity. For example, systemic lupus erythematosus (SLE) is caused by TLR7-mediated induction of type I interferons. Compared with wild-type mice, Sigirr-deficient mice develop excessive lymphoproliferation when the deficiency is introduced into the context of a lupus susceptibility gene. Although the significance of SIGIRR has been widely acknowledged, its inhibition mechanism remains unclear owing to a lack of structural information.

[Figure: B6 lpr/lpr Sigirr +/+ mouse versus B6 lpr/lpr Sigirr -/- mouse (Lech et al., JEM, 2008)]

Mutagenesis studies investigated three deletion mutants of SIGIRR: ΔN (lacking the extracellular Ig domain), ΔTIR (lacking the intracellular TIR domain) and ΔC (lacking the C-tail of the TIR domain, with deletion of residues 313-410).

Construct      Binds to TLR4   Inhibits signaling
ΔN             yes             yes
ΔC             yes             yes
ΔTIR           no              no
Full-length    yes             yes
(Qin et al., 2005, JBC)

The results showed that only the TIR domain (excluding the C-tail) is necessary for SIGIRR to inhibit TLR4 signaling. Nevertheless, the detailed structural interaction mechanisms of SIGIRR's TIR domain are still missing.

Hypothesis: SIGIRR blocks the molecular interface of TLR4 and MyD88 via its TIR domain.
Objective: to find a structural explanation for these TIR-TIR interactions.
1. Structure prediction of the TIR domains of TLRs, MyD88 and SIGIRR.
2. Structure analysis and docking.

Step 1: model construction

Amino acid sequences of the target proteins (human TLR4, TLR7, MyD88 and SIGIRR) were extracted from the NCBI protein database. Three-dimensional models of TLR4, TLR7, MyD88 and SIGIRR (without the C-tail) were constructed by homology modeling. Because the target proteins are homologous, a BLAST search against the Protein Data Bank (PDB) returned the same four templates for all of them: TLR1 (1FYV), TLR2 (1FYW), TLR10 (2J67) and IL-1RAPL (1T3G).

A multiple sequence alignment of each target with the templates was generated with MUSCLE and analyzed with Jalview. The secondary structure of each target was predicted by PSIPRED. Because the TIR domain is composed of well-organized alternating β-strands and α-helices, the alignments were adjusted manually according to this secondary structure information to improve alignment quality. In the resulting secondary structure-aided alignments, the average target-template sequence similarity of TLR4, TLR7, MyD88 and SIGIRR was 51.7%, 50.4%, 44.5% and 42.7%, respectively.

The resulting structures exhibit a typical TIR domain conformation, in which a central five-stranded parallel β-sheet (βA to βE) is surrounded on both sides by a total of five α-helices (αA to αE). The loops are named by the letters of the secondary structure elements that they connect. For example, the BB-loop connects β-strand B and α-helix B. The structure of NSF-N was identified as a template for SIGIRR's C-tail through protein threading. To improve model quality, ModLoop was used to rebuild the coordinates of low-quality loop regions. Finally, the model quality assessment programs ProQ, ModFOLD and MetaMQAP were used to evaluate the candidate models and select the most reliable one.

[Figure: crystal structure of IL-1RAPL (1T3G)]

The BB-loop and αE of TLR4, TLR7 and MyD88, along with the BB-loop of SIGIRR, may be important for the binding specificity achieved by different combinations of TIR domains during signaling.

The surface charge distribution (APBS electrostatics) of the BB-loops and αE helices is shown with red indicating areas of negative charge and blue indicating areas of positive charge. Accordingly, every BB-loop can be divided into two self-complementary parts: the N-terminal part (upper region of the BB-loop) is negatively charged, whereas the C-terminal part (lower region of the BB-loop) is positively charged. The αE helices, by contrast, are predominantly positive.

Step 2: protein-protein docking

Unrestrained pairwise docking was performed for eight complexes of TIR domain models: TLR4-TLR4, TLR7-TLR7, MyD88-MyD88, TLR4 dimer-MyD88 dimer (tetramer), TLR7 dimer-MyD88 dimer (tetramer), TLR4-SIGIRR, TLR7-SIGIRR and MyD88-SIGIRR. We used GRAMM-X and ZDOCK, two widely accepted rigid-body protein-protein docking programs, to predict and assess the interactions within these complexes. The buried surface area at the interface of each dimer model was calculated with the Protein Interfaces, Surfaces and Assemblies service (PISA) at the European Bioinformatics Institute (EBI).

Step 3: hypothesis model construction

From a large number of docking results, we established a model of how SIGIRR inhibits the TLR7 signaling pathway (Lech et al., 2010, J. Pathol.).

In the same way, we established a model of how SIGIRR inhibits the TLR4 signaling pathway.

Step 4: conclusion

In summary, we propose a residue-level structural framework for how SIGIRR inhibits the TLR4 and TLR7 signaling pathways. These results were obtained by computer modeling and are expected to facilitate the design of further site-directed mutagenesis experiments to clarify the regulatory role of SIGIRR in inflammatory and innate immune responses.

Reference: Jing Gong, Tiandi Wei, Robert W. Stark, Ferdinand Jamitzky, Wolfgang M. Heckl, Hans-Joachim Anders, Maciej Lech and Shaila C. Rössle. Inhibition of the Toll-like receptors TLR4 and 7 signaling pathways by SIGIRR: a computational approach. J. Struct. Biol., 2010, 169:323-330 (IF: 4.06; SCI citations: 5).

Case 2 Homology modeling of Toll-like receptor ectodomains

TLR sequences

So far, about 3000 protein sequences of different TLRs from different species have been deposited in primary protein databases, and the number continues to grow.

LRR identification

[Figure: LRR counts of example ectodomains: 22 LRRs + 1 CT; 6 LRRs + 2 N/CT; 6 LRRs + 1 CT; 17 LRRs + 2 N/CT; 22 LRRs. ECD of human TLR3: 23 LRRs + 2 N/CT LRRs.]

Consensus LRR motif: LxxLxLxxNxLxxLxxxxFxxLxx

Candidate segments compared with the motif:
PTNITVLNLTHNQLRRLPAANFTR
NITVLNLTHNQLRRLPAANFTRY
PTNITVLNLTHNQLRRLPAA

TollML database (2734 sequences, 2011/08/01)

Structural motifs (3 levels):
1. Domains of each TLR: signal peptide (SP), ectodomain (ECD), transmembrane domain (TD), TIR domain
2. LRRs of each ECD
3. Segments of each LRR: highly conserved segment (HCS), variable segment (VS), inserted segment (IS)

Construction pipeline

Domains LRRs Segments

LRR Finder

Main algorithm: a position-specific weight matrix of LRR motifs.

[Figure: weight matrix indexed by position and amino acid. Example: LPTNLTVLMLLHNQLRRLPAANFTRYSQLTSLDVGFNT, with candidate scores 3.800, 2.232 and 1.054, each tested against a cutoff (Yes/No).]
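The idea of scanning a sequence with a position-specific weight matrix can be sketched as follows. This is only an illustration, not the LRR Finder implementation: the weights (1.0 for the consensus residue, 0.5 for another hydrophobic residue at an L position), the hydrophobic set and the cutoff of 4.0 are all assumptions made for this example, so the scores do not reproduce the values on the slide.

```python
# Toy weight-matrix scan. All weights and the cutoff are hypothetical.
CONSENSUS = "LxxLxLxxNxLxxLxxxxFxxLxx"  # consensus motif from the slides
HYDROPHOBIC = set("LIVMF")              # assumption: accepted at L positions


def build_weights(consensus):
    """Return one dict per motif position mapping amino acid -> weight."""
    weights = []
    for symbol in consensus:
        if symbol == "x":
            weights.append({})                      # variable position: no score
        elif symbol == "L":
            column = {aa: 0.5 for aa in HYDROPHOBIC}
            column["L"] = 1.0
            weights.append(column)
        else:
            weights.append({symbol: 1.0})           # e.g. the conserved N or F
    return weights


def score_window(window, weights):
    """Sum the weights of the residues found at each motif position."""
    return sum(column.get(aa, 0.0) for aa, column in zip(window, weights))


def scan(sequence, weights, cutoff):
    """Return (start, score) for every window whose score reaches the cutoff."""
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(sequence[start:start + width], weights)
        if score >= cutoff:
            hits.append((start, score))
    return hits


WEIGHTS = build_weights(CONSENSUS)
example = "LPTNLTVLMLLHNQLRRLPAANFTRYSQLTSLDVGFNT"  # example from the slides
print(scan(example, WEIGHTS, cutoff=4.0))
```

A real matrix would be estimated from aligned known LRRs (for example as log-odds scores), and the cutoff would be chosen from a sensitivity/specificity analysis rather than fixed by hand.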

[Figure: for the example LPTNLTVLMLLHNQLRRLPAANFTRYSQLTSLDVGFNT, the candidate scores 3.800, 2.232 and 1.054 pass through the cutoff test (Yes/No) and an optional filter.]

Sensitivity and specificity as a function of the cutoff score:

Cutoff   Sensitivity   Specificity   Specificity (filter)
1.5      0.942         0.852         0.914
1.6      0.933         0.882         0.930
1.7      0.924         0.902         0.953
1.8      0.916         0.916         0.959
1.9      0.907         0.935         0.972
2.0      0.886         0.954         0.981
2.1      0.868         0.970         0.987
2.2      0.858         0.981         0.991
2.3      0.842         0.988         0.994
2.4      0.822         0.992         0.996
2.5      0.805         0.994         0.997
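The trade-off in the sensitivity and specificity values listed above can be made explicit with a small calculation. The slides do not say how the working cutoff was chosen. The sketch below uses Youden's J (sensitivity + specificity - 1), one common criterion, purely as an illustration, and only on the unfiltered specificity values.

```python
# Values copied from the sensitivity/specificity listing above (no filter).
CUTOFFS = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
SENSITIVITY = [0.942, 0.933, 0.924, 0.916, 0.907, 0.886,
               0.868, 0.858, 0.842, 0.822, 0.805]
SPECIFICITY = [0.852, 0.882, 0.902, 0.916, 0.935, 0.954,
               0.970, 0.981, 0.988, 0.992, 0.994]


def youden_j(sensitivity, specificity):
    """Youden's J statistic: 0 for a useless test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


def best_cutoff(cutoffs, sensitivities, specificities):
    """Return the cutoff with the highest Youden's J."""
    scored = [
        (youden_j(se, sp), cutoff)
        for cutoff, se, sp in zip(cutoffs, sensitivities, specificities)
    ]
    return max(scored)[1]


print(best_cutoff(CUTOFFS, SENSITIVITY, SPECIFICITY))
```

With the tabulated values, this criterion selects a cutoff of 1.9. Other criteria, such as requiring a minimum specificity, would give a different choice.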

This database is freely available at http://tollml.lrz.de. Any internet user can search and download data from the database, but only registered users can define and save labels for arbitrary entries.

Reference: Jing Gong, Tiandi Wei, Ning Zhang, Ferdinand Jamitzky, Wolfgang M. Heckl, Shaila C. Rössle and Robert W. Stark. TollML: a database of toll-like receptor structural motifs. J. Mol. Model., 2010, 16(7):1283-1289 (IF: 2.34; SCI citations: 3).

2010/11

Construction pipeline

Every LRR structure can be viewed with Jmol, an online molecular viewer.

To simplify homology modeling, a similarity search was implemented. For an LRR of unknown structure, it returns the structures of the most similar LRRs. First, a global pairwise sequence alignment is generated between the target LRR and each LRR in the user-selected set, and the sequence identity is calculated. Then the most similar LRRs are returned as template candidates, ranked by sequence identity.
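The ranking step can be sketched as follows, assuming a plain Needleman-Wunsch global alignment with match +1, mismatch -1 and a linear gap penalty of -1. These scoring values, the choice of alignment length as the identity denominator, and the candidate names and sequences are assumptions for illustration. The slides do not specify the scoring scheme the database actually uses.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(a, b):
    """Fraction of alignment columns holding two identical residues."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return same / len(x)


def rank_templates(target, candidates):
    """Sort candidate LRRs (name -> sequence) by identity to the target."""
    scored = [(name, identity(target, seq)) for name, seq in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidate set; names and sequences are made up for the example.
candidates = {"lrr_a": "NITVLNLTHNQL", "lrr_b": "NLTELDLSYNQI", "lrr_c": "GGSGGSGGSGGS"}
for name, ident in rank_templates("NITVLNLSHNQL", candidates):
    print(name, round(ident, 2))
```

Sequence identity depends on the denominator convention (alignment length, shorter sequence, or aligned positions only), so values from different tools are not always directly comparable.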

LRRML contains individual three-dimensional LRR structures with manual structural annotations. It is a useful resource for homology modeling and structural analysis of LRR proteins. This database is freely available at http://tollml.lrz.de.

Reference: Tiandi Wei, Jing Gong*, Ferdinand Jamitzky, Wolfgang M. Heckl, Robert W. Stark and Shaila C. Rössle (*corresponding author). LRRML: a conformational database and an XML description of leucine-rich repeats (LRRs). BMC Struct. Biol., 2008, 8:47 (IF: 3.06; SCI citations: 3).

In mammals, 13 TLRs have been identified, and protein sequences are available for a number of mammalian species. Using these sequences, a complete molecular phylogenetic analysis and a phylogenetic tree of the known TLRs have been reported. According to this tree, mammalian TLRs can be divided into six subfamilies. TLR1, 2, 6 and 10 belong to the TLR1 subfamily. TLR3, TLR4 and TLR5 each constitute a subfamily of their own. TLR7, 8 and 9 compose the TLR7 subfamily. TLR11, 12 and 13 belong to the TLR11 subfamily.

Since the crystal structure of the human TLR3 ECD was first reported in 2000, four crystal structures of receptor-ligand complexes have been determined: the human TLR2-TLR1 heterodimer, the mouse TLR3 homodimer, the human TLR4 homodimer and the mouse TLR2-TLR6 heterodimer.

TLR sequences: ~3000 known TLR sequences

Compared with the small number of crystal structures, there are about 3000 known protein sequences of different TLRs from different species. X-ray crystallography remains time-consuming, and some proteins are very difficult to crystallize, whereas computational methods can perform fast, large-scale structure predictions based on the sequences alone. Currently, the most accurate protein structure prediction method is homology modeling.

When applying homology modeling to the TLR ectodomains, we encountered a problem. The sequence identity between a target and the full-length templates, namely the aforementioned crystal structures, is much lower than 30%, because the TLR ectodomains contain diverse numbers and arrangements of LRRs. This divergence is also reflected in the phylogenetic tree. As a result, full-length templates did not yield a proper model. To solve this problem, we developed an LRR template assembly approach with the help of the TollML and LRRML databases.

Flowchart of the LRR template assembly approach

[Figure: TLR3 ECD models obtained by the threading method, from full-length templates and by LRR assembly, compared with the crystal structure]

Superimposition of the model (blue) and crystal structure (orange) of TLR3 at the two ligand interaction regions. Global root mean square deviation: 1.96 Å and 1.90 Å.

If the root mean square deviation (RMSD) between a model and the experimental structure is below 3 Å, the model is very good and can be used for ligand docking and molecular replacement (Zhang et al., 2009).
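As a reminder of what this number measures, the RMSD over N matched atoms is the square root of the mean squared distance between corresponding atoms. The sketch below assumes that the two coordinate sets are already superimposed and matched atom by atom. The superposition step itself (for example the Kabsch algorithm) is omitted, and the function names are made up for this example. The 3 Å threshold is the one quoted above.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, in the input units.

    Assumes the structures are already superimposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_good_model(value, threshold=3.0):
    """Apply the < 3 Å rule of thumb quoted in the text."""
    return value < threshold


# Hypothetical C-alpha coordinates (in Å) for a three-residue fragment.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
value = rmsd(model, crystal)
print(value, is_good_model(value))
```

In practice the RMSD is usually computed over Cα atoms after optimal superposition, either globally or restricted to a region of interest such as a ligand-binding site.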

Average target-template sequence identity >= 45%

Superimposition of the model (green) and crystal structure (orange) of TLR6. Global root mean square deviation: 1.94 Å; ligand-binding region: 1.18 Å.

These models can be used to perform ligand-docking studies or to design mutagenesis experiments that investigate TLR ligand-binding mechanisms, and thus help to develop new TLR agonists and antagonists of therapeutic significance for infectious diseases.

Reference: Tiandi Wei, Jing Gong*, Ferdinand Jamitzky, Wolfgang M. Heckl, Shaila C. Rössle and Robert W. Stark (*corresponding author). A leucine-rich repeat assembly approach for homology modeling of human TLR5-10 and mouse TLR11-13 ectodomains. J. Mol. Model., 2011, 17(1):27-36 (IF: 2.34; SCI citations: 3).

Exam Thesis

Topic: What can bioinformatics do for you?
Language: English
Word count: 1000-2000
Deadline: 2011/11/30
Submit to: gongjing@sdu.edu.cn

Format:
1. The following file formats are acceptable for the thesis: Microsoft Word (.doc), Rich Text Format (RTF), Portable Document Format (PDF).
2. Choose a legible font and use double line spacing. The font size should be no smaller than 11 pt and no larger than 12 pt, with standard margins.
3. Use hard returns only to end headings and paragraphs, not to rearrange lines.
4. All references must be numbered consecutively, in square brackets, in the order in which they are cited in the text, followed by any cited in tables or legends.
5. All pages should be numbered.
6. Greek and other special characters may be included. If you are unable to reproduce a particular special character, please type out the name of the symbol in full.

Thank you very much for your attention!