Bioinformatics. Macromolecular structure

Contents
  Determination of protein structure
  Structure databases
  Secondary structure elements (SSE)
  Tertiary structure
  Structure analysis
    Structure alignment
    Domain recognition
  Structure prediction
    Homology modelling
    Threading/fold recognition
    Secondary structure prediction
    ab initio prediction

Structure: Determination of protein structure. Jacques van Helden, jvanheld@ucmb.ulb.ac.be

Crystallisation
Hanging drop method / vapour diffusion method
[Figure: a drop of (1) dilute protein solution hangs from a microscope slide above (2) a concentrated salt solution; the crystal is observed under a microscope.]
Many different conditions of 1 and 2 must be tried.
Slide courtesy of Shoshana Wodak

Determination of protein structure Diffraction pattern Atomic model Slide courtesy from Shoshana Wodak

The resolution problem. A high-resolution protein structure: 1.5-2.0 Å resolution. Slide courtesy of Shoshana Wodak

Nuclear Magnetic Resonance (NMR) Source: Branden & Tooze (1991)

Interatomic forces Covalent interactions Hydrogen bonds Hydrophobic/hydrophilic interactions Ionic interactions van der Waals force Repulsive forces

Structure: Structure databases

Structure databases
PDB (Protein Data Bank): official structure repository.
SCOP (Structural Classification Of Proteins): structure classification. The top level reflects structural classes. The second level, called Fold, includes topological and similarity criteria.
CATH (Class, Architecture, Topology and Homologous superfamily)

PDB entry header

    HEADER TRANSCRIPTION REGULATION 06-MAR-92 1D66                      1D66 2
    COMPND GAL4 (RESIDUES 1-65) COMPLEX WITH 19MER DNA                  1D66 3
    SOURCE (SACCHAROMYCES $CEREVISIAE) OVEREXPRESSED IN (ESCHERICHIA    1D66 4
    SOURCE 2 $COLI)                                                     1D66 5
    AUTHOR R.MARMORSTEIN,S.HARRISON                                     1D66 6
    REVDAT 1 15-APR-93 1D66 0                                           1D66 7
    JRNL AUTH R.MARMORSTEIN,M.CAREY,M.PTASHNE,S.C.HARRISON              1D66 8
    JRNL TITL /DNA$ RECOGNITION BY /GAL4$: STRUCTURE OF A               1D66 9
    JRNL TITL 2 PROTEIN(SLASH)/DNA$ COMPLEX                             1D66 10
    JRNL REF NATURE V. 356 408 1992                                     1D66 11
    JRNL REFN ASTM NATUAS UK ISSN 0028-0836 006                         1D66 12
    REMARK 1                                                            1D66 13
    REMARK 2                                                            1D66 14
    REMARK 2 RESOLUTION. 2.7 ANGSTROMS.                                 1D66 15
    REMARK 3                                                            1D66 16
    REMARK 3 REFINEMENT.                                                1D66 17
    REMARK 3 PROGRAM CORELS;TNT;XPLOR                                   1D66 18
    REMARK 3 AUTHORS J.SUSSMAN;D.TRONRUD;A.BRUNGER                      1D66 19
    REMARK 3 R VALUE 0.230                                              1D66 20
    REMARK 3 RMSD BOND DISTANCES 0.015 ANGSTROMS                        1D66 21
    REMARK 3 RMSD BOND ANGLES 2.9 DEGREES                               1D66 22
    REMARK 4                                                            1D66 23
    REMARK 4 THERE ARE TWO DNA CHAINS WHICH HAVE BEEN ASSIGNED CHAIN    1D66 24
    REMARK 4 INDICATORS *D* AND *E*. THERE ARE TWO PROTEIN CHAINS       1D66 25
    REMARK 4 WHICH HAVE BEEN ASSIGNED CHAIN INDICATORS *A* AND *B*.     1D66 26
    REMARK 4 EACH PROTEIN - DNA COMPLEX CONTAINS FOUR BOUND CD IONS.    1D66 27
    ...
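Header records like those of entry 1D66 can be read programmatically. The following is a minimal sketch in Python, assuming each line starts with a record name (HEADER, COMPND, REMARK, ...) and that the resolution is given on a REMARK 2 line worded as in this entry. The function names parse_resolution and record_names are hypothetical and do not come from any PDB library.

```python
import re

# Assumption: the resolution line looks like "REMARK 2 RESOLUTION. 2.7 ANGSTROMS."
# with any amount of whitespace between the fields.
_RESOLUTION = re.compile(r"^REMARK\s+2\s+RESOLUTION\.\s+([0-9.]+)\s+ANGSTROMS")


def parse_resolution(header_lines):
    """Return the resolution in angstroms, or None if no such line is found."""
    for line in header_lines:
        match = _RESOLUTION.match(line)
        if match:
            return float(match.group(1))
    return None


def record_names(header_lines):
    """Return the distinct record names in order of first appearance."""
    seen = []
    for line in header_lines:
        tokens = line.split()
        if tokens and tokens[0] not in seen:
            seen.append(tokens[0])
    return seen


example = [
    "HEADER    TRANSCRIPTION REGULATION      06-MAR-92   1D66      1D66   2",
    "COMPND    GAL4 (RESIDUES 1-65) COMPLEX WITH 19MER DNA         1D66   3",
    "REMARK   2 RESOLUTION. 2.7  ANGSTROMS.                        1D66  15",
    "REMARK   3   R VALUE            0.230                         1D66  20",
]

print(parse_resolution(example))
print(record_names(example))
```

A production parser would rely on the fixed column positions defined by the PDB format rather than on whitespace splitting; the regular expression is used here only to keep the sketch short.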

CATH - A protein domain classification
In CATH, protein domains are classified according to a tree with 4 hierarchical levels: Class, Architecture, Topology, Homologous superfamily.
[Figure: examples at the Class, Architecture and Topology levels.]
Figure from Shoshana Wodak

Classifications of protein structures (domains)
CATH: structural classification of proteins [http://www.biochem.ucl.ac.uk/bsm/cath/]
SCOP: structural classification of proteins [http://scop.mrc-lmb.cam.ac.uk/scop/]
FSSP: fold classification based on structure alignments [http://www.sander.ebi.ac.uk/fssp/]
HSSP: homology-derived secondary structure assignments [http://www.sander.ebi.ac.uk/hssp/]
DALI: classification of protein domains [http://www.ebi.ac.uk/dali/domain/]
VAST: structural neighbours by direct 3D structure comparison [http://www.ncbi.nlm.nih.gov:80/structure/vast/vast.shtml]
CE: structure comparisons by Combinatorial Extension [http://cl.sdsc.edu/ce.html]
Slide courtesy of Shoshana Wodak

Books
Branden, C. & Tooze, J. (1991). Introduction to Protein Structure. 1st ed. Garland Publishing Inc., New York and London.
Westhead, D.R., Parish, J.H. & Twyman, R.M. (2002). Bioinformatics. BIOS Scientific Publishers, Oxford.
Mount, D.W. (2001). Bioinformatics: Sequence and Genome Analysis. 1st ed. Cold Spring Harbor Laboratory Press, New York.
Gibas, C. & Jambeck, P. (2001). Developing Bioinformatics Computer Skills. O'Reilly.

Structure: Secondary structure elements

Secondary structure - α-helix Carbon Nitrogen Oxygen hydrogen bond 3.6 residues Source: Branden & Tooze (1991)
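With 3.6 residues per turn, each residue is rotated by 360/3.6 = 100 degrees around the helix axis relative to the previous one. The sketch below, in Python, computes these angular positions (a "helical wheel") for a sequence. The function name wheel_angles and the example peptide are illustrative assumptions, not taken from the slides.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn


def wheel_angles(sequence):
    """Return (residue, angle in degrees) pairs for an ideal alpha-helix.

    The first residue is placed at 0 degrees; each following residue is
    rotated by a further 100 degrees around the helix axis.
    """
    return [
        (residue, round((index * DEGREES_PER_RESIDUE) % 360.0, 1))
        for index, residue in enumerate(sequence)
    ]


# Hypothetical peptide, used only to show the output layout.
for residue, angle in wheel_angles("LKELLKKL"):
    print(residue, angle)
```

Residues whose angles are close to each other lie on the same face of the helix, which is why residues i, i+3, i+4 and i+7 often share the same hydrophobic or polar character.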

Hydrophobicity of side-chain residues in helices Blue: polar Red: basic or acidic Source: Branden & Tooze (1999)

Secondary structure - β sheets Antiparallel Parallel Source: Branden & Tooze (1991)

Secondary structure - twist of β sheets Mixed β sheet Source: Branden & Tooze (1991)

Angles of rotation
Each dipeptide unit is characterized by two angles of rotation:
Phi (φ): rotation around the N-Cα bond
Psi (ψ): rotation around the Cα-C bond
Image from Branden & Tooze (1999)
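Both angles are dihedral (torsion) angles defined by four consecutive backbone atoms: phi by C(i-1), N(i), Cα(i), C(i), and psi by N(i), Cα(i), C(i), N(i+1). The following Python sketch computes a signed dihedral angle from four Cartesian coordinates using only the standard library. The helper names are hypothetical and the coordinates in the example are toy values, not real atom positions.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy coordinates: the fourth point is rotated a quarter turn about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying this function to every residue of a chain yields the (phi, psi) pairs that are plotted on a Ramachandran map.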

The Ramachandran map
[Figure: two dipeptide units.]
Slide courtesy of Shoshana Wodak

Structure: Tertiary structure

Combinations of secondary structures
Retinol binding protein (PDB: 1rbp)
[Figure labels: β-sheet, α-helix, loop]

Bioinformatics: Analysis of structure

Structure-structure alignment and comparison Structure A Structure B Question: Is structure A similar to structure B? Approach: structure alignments Slide courtesy from Shoshana Wodak
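One simple way to quantify how similar two structures are, once equivalent residues have been paired, is to compare their internal distances. The Python sketch below computes a distance RMSD (dRMSD), which needs no superposition because internal distances do not change under rotation or translation. It assumes the residue equivalences are already known and given by list position, which is precisely what a structure alignment method has to find. The function name distance_rmsd and the coordinates are illustrative.

```python
import math
from itertools import combinations


def distance_rmsd(coords_a, coords_b):
    """Root mean square difference between the internal distances of A and B.

    coords_a[i] and coords_b[i] are assumed to be equivalent residues
    (for example C-alpha positions), given as (x, y, z) tuples.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of residues")
    if len(coords_a) < 2:
        raise ValueError("at least two residues are required")
    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(coords_a)), 2):
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        total += (d_a - d_b) ** 2
        n_pairs += 1
    return math.sqrt(total / n_pairs)


# Toy example: B is A rotated by 90 degrees about z and translated.
structure_a = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
structure_b = [(5, 5, 5), (5, 6, 5), (4, 6, 5)]
print(distance_rmsd(structure_a, structure_b))
```

A value close to zero indicates that the paired residues have the same relative arrangement in both structures. Note that dRMSD cannot distinguish a structure from its mirror image.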

Analyzing conformational changes Open form Closed form Citrate synthase, ligand induced conformational changes Domain motion and small structural distortions Slide courtesy from Shoshana Wodak

Defining Domains: What for? Link between domain structure and function Different structural domains can be associated with different functions Enzyme active sites are often at domain interfaces; domain movements play a functional role DNA Methyltransferase Cathepsin D Slide courtesy from Shoshana Wodak

Methods for Identifying Domains
Underlying principle: domain limits are defined by identifying groups of residues such that the number of contacts between groups is minimized.
[Figure: chains drawn from N- to C-terminus, split into domains with 1 cut, 2 cuts or 4 cuts.]
Slide courtesy of Shoshana Wodak
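The principle of minimizing contacts between groups can be illustrated for the simplest case, a single cut of the chain. The Python sketch below builds a contact list from Cα coordinates and scans all cut positions for the one crossed by the fewest contacts. The 8 Å cutoff, the minimum sequence separation, the minimum domain size and the function name best_single_cut are assumptions made for this illustration. Domains built from several chain segments (the 2-cut and 4-cut cases) are not handled.

```python
import math


def best_single_cut(ca_coords, cutoff=8.0, min_separation=3, min_domain_size=2):
    """Return (cut, n_contacts) for the cut with the fewest crossing contacts.

    Residues [0, cut) form the first group and residues [cut, n) the second.
    Returns None if the chain is too short to be split.
    """
    n = len(ca_coords)
    # Contacts between residues far enough apart in sequence (assumed
    # threshold) and close enough in space (assumed cutoff in angstroms).
    contacts = [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff
    ]
    best = None
    for cut in range(min_domain_size, n - min_domain_size + 1):
        crossing = sum(1 for i, j in contacts if i < cut <= j)
        if best is None or crossing < best[1]:
            best = (cut, crossing)
    return best


# Toy chain: two compact groups of four residues, 30 angstroms apart.
toy_chain = [
    (0, 0, 0), (3, 0, 0), (3, 3, 0), (0, 3, 0),
    (30, 0, 0), (33, 0, 0), (33, 3, 0), (30, 3, 0),
]
print(best_single_cut(toy_chain))
```

Published methods typically normalise the number of crossing contacts by the sizes of the two groups, so that trivially small fragments at the chain ends are not favoured; the minimum domain size used here is a crude stand-in for that.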

Lactate dehydrogenase Domains From Contact Map Slide courtesy from Shoshana Wodak

Structure: Structure prediction

Methods for structure prediction
Homology modelling: building a 3D model on the basis of similar sequences.
Threading: threading the sequence onto all known protein structures and testing the consistency.
Secondary structure prediction.
ab initio prediction of tertiary structure: for proteins of normal size, it is almost impossible to predict structures ab initio. Some results have been obtained in the prediction of oligopeptide structures.

Homology modelling - steps
Similarity search
Modelling of the backbone
  Secondary structure elements
  Loops
Modelling of side chains
Refinement of the model
Verification
  Steric compatibility of the residues

Homology modelling - similarity search
Starting from a query sequence, search a database of protein structures for similar sequences with known structure.
Multiple alignment. A weight can be assigned to each matching protein (higher score for more similar proteins).
The higher the sequence similarity, the more accurate the predicted structure.
When structures are available for proteins with >70% similarity to the query, a good model can be expected.
When the similarity is <40%, homology modelling gives poor results.
The lack of available structures is one of the main limitations of homology modelling.
In 2004, PDB contains
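The thresholds quoted above can be applied to a pairwise alignment between the query and a candidate template. The Python sketch below computes the percentage of identical residues over the aligned (non-gap) positions and maps it to the expectations stated on the slide. Using percent identity as a stand-in for "similarity" is an assumption, as are the function names and the label given to the 40-70% range, about which the slide makes no statement.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percentage of identical residues over positions without gaps."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == gap or t == gap:
            continue  # gap positions are not counted
        aligned += 1
        if q == t:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


def expected_model_quality(similarity_percent):
    """Map a similarity value to the expectation stated on the slide."""
    if similarity_percent > 70:
        return "good"
    if similarity_percent < 40:
        return "poor"
    return "intermediate"  # range not characterised on the slide


# Hypothetical alignment, used only to show the calculation.
identity = percent_identity("ACDE-F", "ACDQKF")
print(identity, expected_model_quality(identity))
```

In practice the search step returns several candidate templates, and the same calculation can be used to rank them or to derive the weights mentioned above.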

Homology modelling - backbone modelling
Modelling of secondary structure elements (α-helices, β-sheets): for each secondary structure element of the template, align the backbone of query and template.
Loop modelling: databases of loop regions. The loop main chain depends on the number of amino acids and on the neighbouring elements (α-α, α-β, β-α, β-β).

Homology modelling - Side chain modelling Side-chain conformation (model building and energy refinement) Conserved side chains take same coordinates as in the template. For non-conserved side chains, use rotamer libraries to determine the most favourable conformation.

Homology modelling - refinement After the steps above have been completed, the model can be refined by modifying the positions of some atoms in order to reduce the energy.
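The idea of moving atoms to reduce the energy can be shown with a deliberately simplified example. The Python sketch below minimises a toy energy made only of harmonic bond terms between consecutive atoms, using steepest descent with a fixed step. The energy function, the ideal bond length of 1.5 Å, the force constant and the step size are all assumptions chosen for illustration; real refinement programs use complete force fields with angle, torsion and non-bonded terms.

```python
import math


def bond_energy(coords, d0=1.5, k=1.0):
    """Toy energy: sum of k * (d - d0)^2 over consecutive atom pairs."""
    return sum(k * (math.dist(a, b) - d0) ** 2 for a, b in zip(coords, coords[1:]))


def minimise(coords, d0=1.5, k=1.0, step=0.05, n_steps=500):
    """Steepest descent on bond_energy; returns the refined coordinates."""
    coords = [list(c) for c in coords]
    for _ in range(n_steps):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i in range(len(coords) - 1):
            a, b = coords[i], coords[i + 1]
            d = math.dist(a, b)
            if d == 0.0:
                continue  # bond direction undefined, skip this term
            # Derivative of k * (d - d0)^2 with respect to the position of a.
            factor = 2.0 * k * (d - d0) / d
            for axis in range(3):
                g = factor * (a[axis] - b[axis])
                grads[i][axis] += g
                grads[i + 1][axis] -= g
        # Move every atom a small step against its gradient.
        for c, g in zip(coords, grads):
            for axis in range(3):
                c[axis] -= step * g[axis]
    return [tuple(c) for c in coords]


# Toy model: one bond too short (1.0) and one too long (2.0).
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
refined = minimise(start)
print(bond_energy(start), bond_energy(refined))
```

Steepest descent only finds the nearest local minimum, so refinement improves local geometry but does not correct errors in the alignment or in the choice of template.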