COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University

Similar documents
Protein Data Bank Contents Guide: Atomic Coordinate Entry Format Description. Version Document Published by the wwpdb

Appendices. Appendix I: PDB file format

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Protein Structure and Visualisation. Introduction to PDB and PyMOL

Bioinformatics. Macromolecular structure

CAP 5510 Lecture 3 Protein Structures

RNA and Protein Structure Prediction

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB

Protein Data Bank Contents Guide: Atomic Coordinate Entry Format Description. Version 3.0, December 1, 2006 Updated to Version 3.

Basics of protein structure

ALL LECTURES IN SB Introduction

Motif Prediction in Amino Acid Interaction Networks

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748

Procheck output. Bond angles (Procheck) Structure verification and validation Bond lengths (Procheck) Introduction to Bioinformatics.

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007

CMPS 3110: Bioinformatics. Tertiary Structure Prediction

Analysis and Prediction of Protein Structure (I)

What is the central dogma of biology?

Computational Molecular Modeling

Study of Mining Protein Structural Properties and its Application

Introduction to" Protein Structure

Orientational degeneracy in the presence of one alignment tensor.

Sequence analysis and comparison

Bi 8 Midterm Review. TAs: Sarah Cohen, Doo Young Lee, Erin Isaza, and Courtney Chen

Bio nformatics. Lecture 23. Saad Mneimneh

Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program)

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Structure Comparison

Template Free Protein Structure Modeling Jianlin Cheng, PhD

Protein folding. α-helix. Lecture 21. An α-helix is a simple helix having on average 10 residues (3 turns of the helix)

Protein structure alignments

Syllabus BINF Computational Biology Core Course

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE

Combinatorial approaches to RNA folding Part I: Basics

Computational Biology: Basics & Interesting Problems

Introduction to Computational Structural Biology

D Dobbs ISU - BCB 444/544X 1

Sequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5

HMM applications. Applications of HMMs. Gene finding with HMMs. Using the gene finder

Protein Structures. Sequences of amino acid residues 20 different amino acids. Quaternary. Primary. Tertiary. Secondary. 10/8/2002 Lecture 12 1

Design of a Novel Globular Protein Fold with Atomic-Level Accuracy

Intro Secondary structure Transmembrane proteins Function End. Last time. Domains Hidden Markov Models

Today. Last time. Secondary structure Transmembrane proteins. Domains Hidden Markov Models. Structure prediction. Secondary structure

F. Piazza Center for Molecular Biophysics and University of Orléans, France. Selected topic in Physical Biology. Lecture 1

Genomics and bioinformatics summary. Finding genes -- computer searches

Reconstructing Amino Acid Interaction Networks by an Ant Colony Approach

RNA & PROTEIN SYNTHESIS. Making Proteins Using Directions From DNA

1. Protein Data Bank (PDB) 1. Protein Data Bank (PDB)

Protein Structure Prediction Using Multiple Artificial Neural Network Classifier *

Part 7 Bonds and Structural Supports

Protein Structure Prediction, Engineering & Design CHEM 430

Protein Structure: Data Bases and Classification Ingo Ruczinski

Bonds and Structural Supports

STRUCTURAL BIOINFORMATICS. Barry Grant University of Michigan

SCOP. all-β class. all-α class, 3 different folds. T4 endonuclease V. 4-helical cytokines. Globin-like

Template Free Protein Structure Modeling Jianlin Cheng, PhD

From Amino Acids to Proteins - in 4 Easy Steps

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Supersecondary Structures (structural motifs)

Protein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror

Protein Structure Basics

Protein Structure. Hierarchy of Protein Structure. Tertiary structure. independently stable structural unit. includes disulfide bonds

Grundlagen der Bioinformatik Summer semester Lecturer: Prof. Daniel Huson

Structure-Based Comparison of Biomolecules

BIRKBECK COLLEGE (University of London)

Proteins: Structure & Function. Ulf Leser

From gene to protein. Premedical biology

Ranjit P. Bahadur Assistant Professor Department of Biotechnology Indian Institute of Technology Kharagpur, India. 1 st November, 2013

Dihedral Angles. Homayoun Valafar. Department of Computer Science and Engineering, USC 02/03/10 CSCE 769

Unit 1: Chemistry - Guided Notes

A General Model for Amino Acid Interaction Networks

Biophysics II. Key points to be covered. Molecule and chemical bonding. Life: based on materials. Molecule and chemical bonding

Protein Structure Prediction 11/11/05

BCH 4053 Spring 2003 Chapter 6 Lecture Notes

Objective: Students will be able identify peptide bonds in proteins and describe the overall reaction between amino acids that create peptide bonds.

114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009

Getting To Know Your Protein

STRUCTURAL BIOINFORMATICS I. Fall 2015

Videos. Bozeman, transcription and translation: Crashcourse: Transcription and Translation -

Supplementary Figure 1. Aligned sequences of yeast IDH1 (top) and IDH2 (bottom) with isocitrate

Conditional Graphical Models

CS612 - Algorithms in Bioinformatics

Ant Colony Approach to Predict Amino Acid Interaction Networks

Molecular Modelling. part of Bioinformatik von RNA- und Proteinstrukturen. Sonja Prohaska. Leipzig, SS Computational EvoDevo University Leipzig

Translation Part 2 of Protein Synthesis

Number sequence representation of protein structures based on the second derivative of a folded tetrahedron sequence

Heteropolymer. Mostly in regular secondary structure

Algorithms in Computational Biology (236522) spring 2008 Lecture #1

Presentation Outline. Prediction of Protein Secondary Structure using Neural Networks at Better than 70% Accuracy

Homology Modeling. Roberto Lins EPFL - summer semester 2005

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

From Gene to Protein

1/40. Cellular mechanics I nd term

BCB 444/544 Fall 07 Dobbs 1

Lecture 26: Polymers: DNA Packing and Protein folding 26.1 Problem Set 4 due today. Reading for Lectures 22 24: PKT Chapter 8 [ ].

Computational Molecular Biology (

Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror

Transcription:

COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University

General informations (1) Office hours: by appointment Office: TR3018 Contact: jerome.waldispuhl@mcgill.ca Web: Go to My Course

General informations (2) Evaluation: 2 assignments (15% each) 2 paper reports & presentations (10% each) 1 project (45%) Participation (5%)

General informations (3) Objective: Extends COMP462/561 Topics: Structural Bioinformatics & System Biology Background: Algorithmic, Programming & Basic knowledge in Molecular Biology Invited lectures

Central dogma of biology DNA Transfer RNA Transcription RNA Translation RNA-Protein interactions Ribosome Protein (Images not to scale) Etcetera Function Protein-protein interactions

The 3 components of the Bioinformatics 1. Genomic: Study of an organism's entire genome. Huge amount of data, limited to the sequence. 2. System Biology: Study of complex interactions in biological systems. High-level of representation, practical interests. 3. Computational Structural Biology: Study of the bio-molecule folding process. Lack of data in early year of bioinformatics, step toward the function, fill the gap between genomic and system biology.

Part 1 Computational Structural Biology

Modeling structures Protein RNA We introduce a intermediate representation (secondary structure) between the sequence (primary) and the 3D structure (tertiary).

Classification of structure & folding prediction methods Structure prediction Comparative/Homology modeling: similar sequences fold the same. Threading/Fold recognition: fold a sequence on a known 3D template. Ab-initio method: Sampling the conformational space. Folding pathway prediction: Molecular dynamics: simulation under known laws of physics. Motion planning: simulation of atomic robotic motions. Coarse grained model: Discrete modeling of the folding landscape

Protein Structure: amino acids The 20 amino acids. Building blocks of a protein. They differs by the nature of their side-chain (radical).

Proteins: Peptide bond The sequence of amino acids is called the primary structure

Protein secondary structure: α-helices Features: 3.6 amino acids per turn, hydrogen bond between residues n and n+4, local motif, approximately 40% of the structure.

Protein secondary structure: β-sheets Features: 2 amino acids per turn, hydrogen bond between residues of different strands, involve long-range interactions, approximately 20% of the structure.

Protein secondary structure: Turns Features: Up to 5 residue length, hydrogen bonds depend of type, local interactions, approximately 5-10% of the structure.

Secondary structure element are assembled together to form the tertiary structure. Complexes built from more than one chain form a quaternary structure.

RNA structure Base-pair Maximal planar representation (no crossing edges) of the graph of the base-pairs (watson-crick + wobble). (More details in the next lecture.)

Databases Protein Data bank: www.rcsb.org (3D structures) MSD-EBI: www.ebi.ac.uk/msd (3D structures) PDBj: www.pdbj.org (3D structures) UniProtKB/Swiss-Prot: expasy.org/sprot (annotated protein) CATH: cathdp.info (structure classification) SCOP: scop.mrc-lmb.cam.ac.uk/scop (structure classification) BMRB: www.bmrb.wisc.edu (NMR) NDB: ndbserver.rutgers.edu (ARNs)

Protein Data Bank

PDB format Keywords: SEQRES: amino acid or nucleic acid sequence. MODRES: descriptions of modifications to residues. HELIX: identify the position of helices in the molecule. SHEET: position of sheets in the molecule. TURN: identify turns and other short loop turns. ATOM: atomic coordinates for standard residues. HETATM: atomic coordinate of atoms within "non-standard" groups. CONECT: connectivity between atoms for which coordinates are supplied. HYDBND: specify hydrogen bonds in the entry. SSBOND: disulfide bond.

PDB format (2) COLUMNS DATATYPE FIELD DEFINITION 1-6 Record name "ATOM " 7-11 Integer serial Atom serial number. 13-16 Atom name Atom name. 17 Character altloc Alternate location indicator. 18-20 Residue name resname Residue name. 22 Character chainid Chain identifier. 23-26 Integer resseq Residue sequence number. 27 Char icode Code for insertion of residues. 31-38 Real(8.3) x Orthogonal coordinates for X in Angstroms. 39-46 Real(8.3) y Orthogonal coordinates for Y in Angstroms. 47-54 Real(8.3) z Orthogonal coordinates for Z in Angstroms. 55-60 Real(6.2) occupancy Occupancy. 61-66 Real(6.2) tempfactor Temperature factor. 73-76 LString(4) segid Segment identifier, left-justified. 77-78 LString(2) element Element symbol, right-justified. 79-80 LString(2) charge Charge on the atom.

Classical secondary structure prediction algorithms. Lecture 2: Classical secondary structure prediction algorithms. Lecture 3: RNA sequence/structure alignment. Lecture 4: Stochastic secondary structure prediction. RNA dotplot

Extended secondary structures Lecture 5: RNA saturated secondary structures and RNA shapes. Lecture 6: RNA secondary structures with pseudoknots, RNA-RNA interaction. RNA-RNA interaction Pseudo-knotted RNA secondary structure:

Lecture 7-9: Theoretical studies in the RNA secondary structure model Lecture 7: Grammatical modeling of RNA structures. Lecture 8: Asymptotics of RNA secondary structures Lecture 9: Evolution, neutral network. Lecture 10: Synthetic Biology, RNA design. Connected neutral network Grammatical modeling of RNA structure

Lecture 11-13: Advanced topics Lecture 11: RNA 3D structure modeling, alignment and prediction. Lecture 12: Genomic identification of structural RNAs Lecture 13: RNA folding kynetics

Lecture 14-15: 3D modeling and simulation Lecture 14: Introduction to protein structure prediction. & Conformational search and Molecular Dynamics. Lecture 15: Threading, fragment assembly, side-chain packing.

Lecture 16-18: template based predictions Lecture 16: Protein secondary structure prediction. Lecture 17: Language theory as a tool for protein structure modeling and prediction. Lecture 18: Transmembrane proteins. HMM modeling of transmembrane beta-barrel (Bigelow et al., 2010)

Lecture 19-21: Folding pathways Lecture 19: Protein folding on a lattice models. Lecture 20: Residue contact prediction & folding pathways. Lecture 21: Integrative methods. Protein folding in HP model Folding landscape

Part 1 System Biology

Protein-protein interaction networks

Gene interaction network (Magtanong et al., 2010)

Lecture 22-23: Algorithms for interaction network Lecture 22: Modeling interaction network Lecture 23: Networks alignments & evolution. IsoRank (Singh et al., 2008)

Lecture 24: Unifying Structural & System Biology (Lieberman-Aiden et al., 2009)