Computational Molecular Modeling


Computational Molecular Modeling Lecture 1: Structure Models, Properties Chandrajit Bajaj

Today's Outline
- Intro to atoms, bonds, structure, biomolecules
- Geometry of Proteins, Nucleic Acids, Ribosomes, Viruses
- Space Occupancy, Bonded, Non-Bonded Areas, Volumes, Derivatives
- Dynamic Maintenance and Bioinformatics

The Tree of Life
[Figure: tree of life with labels Eukaryotic cells, Prokaryotic cell, Ribosome, Viruses? (The World of the Cell, 1996)]
"The problems of chemistry and biology can be greatly helped if our ability to see what we are doing, and to do things on an atomic level, is ultimately developed - a development which I think cannot be avoided." Richard Feynman, 1959, Caltech


X-ray Diffraction Analysis for Atomic Resolution Structure Determination
- X-ray crystallography (diffraction)
- Atomic resolution
- Difficulties (experimental, computational)
[Figure: human deoxy-hemoglobin, Protein Data Bank]

X-ray Crystallography: Elucidating Structure
- Periodicity and Symmetry in a Crystal
- Diffraction Pattern and Bragg's Law
- Reciprocal Space and Fourier Transform
- Phase Problem and Solutions
- Fitting, Refinement, and Validation
Crystal -> Diffraction pattern -> Electron density -> Model
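Bragg's law, listed on the slide above, relates the spacing d between lattice planes to the diffraction angle θ: n λ = 2 d sin θ. The following is a minimal sketch and not part of the original slides; the function name and the Cu K-alpha wavelength in the example are assumptions chosen for illustration.

```python
import math

def bragg_spacing(wavelength, theta_deg, order=1):
    """Plane spacing d from Bragg's law: order * wavelength = 2 * d * sin(theta)."""
    s = math.sin(math.radians(theta_deg))
    if s <= 0.0:
        raise ValueError("theta must lie strictly between 0 and 180 degrees")
    return order * wavelength / (2.0 * s)

# Illustrative values (assumed): Cu K-alpha radiation, about 1.5418 angstrom,
# diffracted at theta = 30 degrees.
d = bragg_spacing(1.5418, 30.0)
```

Because sin 30° = 1/2, the spacing in this example equals the wavelength, which makes the relation easy to check by hand.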

Molecular Structure of Hemoglobin: secondary, tertiary, quaternary structure. One myoglobin chain contains eight α-helices and no β-sheets. Nobel Prize

The PDB file
ATOM      1  N   GLU A  27      41.211  44.533  94.570  1.00 85.98
ATOM      2  CA  GLU A  27      42.250  44.748  95.621  1.00 86.10
ATOM      3  C   GLU A  27      42.601  43.408  96.271  1.00 85.99
ATOM      4  O   GLU A  27      43.691  42.865  96.065  1.00 85.71
ATOM      5  CB  GLU A  27      41.725  45.720  96.687  1.00 86.36
ATOM      6  CG  GLU A  27      42.804  46.349  97.563  1.00 86.44
ATOM      7  CD  GLU A  27      43.628  47.387  96.817  1.00 86.98
ATOM      8  OE1 GLU A  27      44.194  47.051  95.754  1.00 87.40
ATOM      9  OE2 GLU A  27      43.713  48.540  97.296  1.00 87.02
ATOM     10  N   ARG A  28      41.662  42.882  97.053  1.00 85.65
ATOM     11  CA  ARG A  28      41.839  41.607  97.739  1.00 85.29
ATOM     12  C   ARG A  28      41.380  40.458  96.835  1.00 85.31
ATOM     13  O   ARG A  28      42.184  39.619  96.424  1.00 85.09
ATOM     14  CB  ARG A  28      41.035  41.607  99.045  1.00 84.62
ATOM     15  CG  ARG A  28      39.564  41.944  98.851  1.00 84.07
ATOM     16  CD  ARG A  28      38.845  42.152 100.169  1.00 84.00
ATOM     17  NE  ARG A  28      37.423  42.439  99.980  1.00 84.27
ATOM     18  CZ  ARG A  28      36.945  43.413  99.208  1.00 84.53
ATOM     19  NH1 ARG A  28      37.771  44.208  98.537  1.00 83.83
ATOM     20  NH2 ARG A  28      35.634  43.598  99.111  1.00 84.38
...
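ATOM records such as those above are fixed-column text, so they are usually read by slicing character positions rather than by splitting on whitespace. The sketch below is illustrative only: the `Atom` tuple and `parse_atom_line` are hypothetical names, and the slices assume the standard wwPDB column layout (serial in columns 7-11, atom name in 13-16, coordinates in 31-54, occupancy in 55-60, temperature factor in 61-66).

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float

def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-column ATOM/HETATM record (0-based Python slices)."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        temp_factor=float(line[60:66]),
    )

# First record of the listing, laid out in the standard column positions.
record = "ATOM      1  N   GLU A  27      41.211  44.533  94.570  1.00 85.98"
atom = parse_atom_line(record)
```

Slicing keeps working when neighbouring fields touch (for example a coordinate of 100.169 directly after a two-digit one), which is where whitespace splitting tends to fail.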

PDB -> PQR
Replace the occupancy and temperature-factor columns of a PDB file with per-atom charge (Q) and radius (R).

PDB:
Field  Atom_num  Atom_name  Res_name  Chain_ID  Res_num  X       Y        Z      Occupancy  Temp   Element
ATOM   76368     CB         LYS       L         57       87.677  124.547  7.349  1.00       35.51  C
ATOM   76369     CG         LYS       L         57       86.549  125.304  6.741  1.00       37.35  C
ATOM   76370     CD         LYS       L         57       85.427  124.333  6.451  1.00       38.17  C

PQR:
Field  Atom_num  Atom_name  Res_name  Chain_ID  Res_num  X       Y        Z      Charge  Radius  Element
ATOM   76368     CB         LYS       L         57       87.677  124.547  7.349   0.211  1.908   C
ATOM   76369     CG         LYS       L         57       86.549  125.304  6.741  -0.303  1.908   C
ATOM   76370     CD         LYS       L         57       85.427  124.333  6.451   0.799  1.908   C

Two widely used approaches: PDB2PQR and Amber
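The column swap described above can be sketched in a few lines. This is an illustration under stated assumptions, not the PDB2PQR or Amber implementation: `PARAMS` and `pdb_fields_to_pqr` are hypothetical names, the lookup table simply reuses the three charge/radius pairs shown on the slide, and the rows are treated as whitespace-separated, which holds for the slide's rows but not for every real PDB file.

```python
# Hypothetical (residue, atom) -> (charge, radius) table. The entries reuse the
# numbers from the PQR rows on the slide; a real table comes from a force field.
PARAMS = {
    ("LYS", "CB"): (0.211, 1.908),
    ("LYS", "CG"): (-0.303, 1.908),
    ("LYS", "CD"): (0.799, 1.908),
}

def pdb_fields_to_pqr(fields, params=PARAMS):
    """Swap occupancy and temperature factor for charge and radius.

    `fields` is a whitespace-split ATOM row:
    record, serial, atom, residue, chain, res_num, x, y, z, occupancy, temp[, element]
    """
    record, serial, atom, res, chain, res_num, x, y, z = fields[:9]
    charge, radius = params[(res, atom)]
    # Keep anything after the two replaced columns (here: the element symbol).
    return [record, serial, atom, res, chain, res_num, x, y, z,
            f"{charge:.3f}", f"{radius:.3f}"] + list(fields[11:])

row = "ATOM 76369 CG LYS L 57 86.549 125.304 6.741 1.00 37.35 C".split()
pqr_row = " ".join(pdb_fields_to_pqr(row))
```

Writing the charge with an explicit separator avoids fused fields when a negative charge directly follows a coordinate.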

What is the atomic charge?
Atomic Charges
- Based on atomic electronegativity, optimized for a given force field. Example: Gasteiger charges.
- Based on atomic electronegativity and the resulting electrical field. Example: charge equilibration (QEq) charges.
- Based on the electronic distribution calculated by QM. Example: Mulliken charges.
- Based on the electrostatic potential near the molecule, calculated by a non-empirical method (or determined experimentally). Examples: CHELP, CHELPG, RESP.
Center for Computational Visualization, Institute for Computational and Engineering Sciences, Sep 2011

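Proteins
Amino acids contain an amino group, a residue (side chain R), and a carboxyl group. Proteins are polypeptide chains, made from amino acids combined via peptide bonds.
[Figure: chemical structure of an amino acid (N, Cα with side chain R, C, O, OH) and of a polypeptide chain linked by peptide bonds]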

Amino Acids (I) Unlabeled atoms are either carbon or hydrogen. C alpha atoms are shaded. Double bonds and partial double bonds are shown in bold.

Amino Acids (II) Unlabeled atoms are either carbon or hydrogen. C alpha atoms are shaded. Double bonds and partial double bonds are shown in bold.

Protein Geometry: Backbone + SideChains

Figure Phil Bradley: pbradley@fhcrc.org


RNA: Ribonucleic Acids
- Nucleotide: base + sugar + phosphoric acid; the RNA polymer is a chain of nucleotides
- Bases: Adenine, Guanine, Cytosine, Uracil
- Torsion angles: α, β, γ, δ, ε, ζ along the backbone (P, O5', C5', C4', C3', O3', P) and χ between the sugar (C1') and the base
- Can also specify ribose dihedral angles and puckering phase, amplitude
[Figure: nucleotide with labeled atoms and torsion angles]
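Each torsion angle named above is the dihedral angle defined by four consecutive atoms. A minimal sketch of that computation follows; `dihedral_deg` and its helpers are hypothetical names, and the sign convention assumed here is the usual one (positive when the far bond is rotated clockwise relative to the near bond, looking along the central bond).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)          # from p1 back to p0
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)          # from p2 on to p3
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Four points with the outer bonds at right angles about the central bond.
angle = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```

Using atan2 on the two projected components avoids the loss of sign and the numerical trouble near 0 and 180 degrees that an arccos of the normal vectors would have.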

Bases

Figure Phil Bradley: pbradley@fhcrc.org

Molecular Models I
Solvent molecule modeled as a probe sphere. Water: radius 1.4 Å.
- VDW (van der Waals): union of spheres with VDW radii
- SAS (solvent accessible): locus of the probe center
- SES/SCS (solvent excluded / solvent contact): the molecular surface
[Figure: probe sphere and molecule 1NT5 with the VDW, SES/SCS, and SAS surfaces]
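Since the SAS is the locus of the probe center, it is the boundary of the union of spheres whose radii are inflated by the probe radius. One common numerical way to estimate its area is Shrake-Rupley point sampling, sketched below. This is a swapped-in textbook technique, not the method developed in these slides; `sphere_points`, `sas_area`, and the sample count are assumptions for illustration.

```python
import math

def sphere_points(n):
    """n roughly uniform points on the unit sphere (golden-spiral sampling)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def sas_area(atoms, probe=1.4, n=960):
    """Estimate the SAS area of atoms given as (x, y, z, vdw_radius) tuples."""
    unit = sphere_points(n)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe                       # inflated radius
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A sample point is buried if it lies inside another inflated sphere.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n
    return total

# Two overlapping atoms expose less area than two isolated ones.
area = sas_area([(0.0, 0.0, 0.0, 1.5), (2.0, 0.0, 0.0, 1.5)])
```

The double loop is quadratic in the number of atoms; a neighbour grid would normally be used to restrict the burial test to nearby atoms.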

Power Diagram & Union of Balls (VDW)
[Figure panels: Union of disks (CPK); Laguerre Voronoi (Power) Diagram; Regular Triangulation; Skeletal Complex]
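The power (Laguerre) diagram assigns each point to the ball with the smallest power distance, |x - c|^2 - r^2, and a point lies in the union of balls exactly when that smallest value is not positive. The following brute-force sketch illustrates the definition only; the function names are hypothetical and no diagram or triangulation is actually constructed.

```python
def power_distance(x, center, radius):
    """Power distance of point x to a ball: |x - center|^2 - radius^2."""
    return sum((a - b) ** 2 for a, b in zip(x, center)) - radius ** 2

def power_cell(x, balls):
    """Index of the ball whose power cell contains x, and the power distance to it."""
    dists = [power_distance(x, c, r) for c, r in balls]
    i = min(range(len(balls)), key=dists.__getitem__)
    return i, dists[i]

def in_union(x, balls):
    """True when x lies inside (or on the boundary of) at least one ball."""
    return power_cell(x, balls)[1] <= 0.0

# Two balls given as (center, radius), in illustrative units.
balls = [((0.0, 0.0, 0.0), 1.0), ((3.0, 0.0, 0.0), 2.0)]
cell, dist = power_cell((1.2, 0.0, 0.0), balls)
```

In the example the query point is closer in Euclidean distance to the first center, yet it falls in the power cell of the second, larger ball; this weighting by radius is what distinguishes the power diagram from the ordinary Voronoi diagram.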

Laguerre Geometry & Union of Balls
- H. Edelsbrunner. The union of balls and its dual shape. Disc. Comput. Geom., 13:415-440, 1995.
- C. Bajaj, V. Pascucci, A. Shamir, R. Holt, A. Netravali. Dynamic Maintenance and Visualization of Molecular Surfaces. Discrete Applied Mathematics 127 (2003), pages 23-51.

Adaptive Grids & Union of Balls
Legend
Gridpoint classes:
- VDW (red)
- SAS (green)
- OUT (unmarked)
Gridcell classes:
- Buried (brown)
- VDWBoundary (light green)
- SASBand (dark blue)
- SASBoundary (light blue)
- Out (white)
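The three gridpoint classes can be illustrated with a direct test against every atom. The slide does not spell out the definitions, so this sketch assumes them: VDW means inside some atom's van der Waals sphere, SAS means inside some probe-inflated sphere but inside no VDW sphere, and OUT means neither. The function name is hypothetical.

```python
def classify_gridpoint(p, atoms, probe=1.4):
    """Label a point 'VDW', 'SAS', or 'OUT' against atoms (x, y, z, vdw_radius)."""
    label = "OUT"
    for x, y, z, r in atoms:
        d2 = (p[0] - x) ** 2 + (p[1] - y) ** 2 + (p[2] - z) ** 2
        if d2 <= r * r:
            return "VDW"                 # inside a VDW sphere: strongest class
        if d2 <= (r + probe) ** 2:
            label = "SAS"                # inside an inflated sphere only
    return label

# One atom of radius 1.5 at the origin, probed along the x axis.
atoms = [(0.0, 0.0, 0.0, 1.5)]
labels = [classify_gridpoint((x, 0.0, 0.0), atoms) for x in (0.0, 2.0, 4.0)]
```

This brute-force version visits every atom per gridpoint; the grid-based bookkeeping in the slides exists to avoid exactly that cost.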

Adap. Grids & Updating Union of Balls: Add
- Gridpoints can be reclassified: SAS -> VDW, OUT -> SAS, OUT -> VDW
- Gridcell classification is also changed based on the new classification of gridpoints
- The new atom is marked as exposed if its insertion resulted in marking a gridpoint as SAS
- Previously exposed atoms intersecting the new atom are marked buried if their SAS volume does not contain any gridpoint marked SAS
- Add is O(1) under the assumption that the grid spacing h = O(r_s) and r_s = O(r_max)

Adap. Grids & Updating Union of Balls: Remove
- Gridpoints can be reclassified: SAS -> OUT, VDW -> SAS, VDW -> OUT
- Gridcell classification is also changed based on the new classification of gridpoints
- Previously buried atoms intersecting the removed atom can become exposed if their SAS volume now contains any gridpoint marked SAS
- Removal is O(1) under the assumption that the grid spacing h = O(r_s) and r_s = O(r_max)
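The local nature of Add and Remove can be sketched with a uniform grid that stores, for each gridpoint, how many VDW spheres and how many probe-inflated spheres cover it. This counting scheme is an assumption made for illustration, not the data structure of the cited paper: it uses a uniform rather than adaptive grid, and it omits gridcell classes and the exposed/buried marking of atoms. `BallGrid` and its methods are hypothetical names.

```python
import math
from collections import defaultdict

class BallGrid:
    """Uniform grid of spacing h with per-gridpoint coverage counts."""

    def __init__(self, h=1.0, probe=1.4):
        self.h = h
        self.probe = probe
        self.vdw = defaultdict(int)   # grid index -> number of covering VDW spheres
        self.sas = defaultdict(int)   # grid index -> number of covering inflated spheres

    def _update(self, atom, delta):
        x, y, z, r = atom
        big_r = r + self.probe
        lo = [math.floor((c - big_r) / self.h) for c in (x, y, z)]
        hi = [math.ceil((c + big_r) / self.h) for c in (x, y, z)]
        # Only gridpoints in the bounding box of the inflated sphere are visited,
        # so the work per update does not depend on the total number of atoms.
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    d2 = ((i * self.h - x) ** 2 + (j * self.h - y) ** 2
                          + (k * self.h - z) ** 2)
                    if d2 <= big_r * big_r:
                        self.sas[(i, j, k)] += delta
                    if d2 <= r * r:
                        self.vdw[(i, j, k)] += delta

    def add(self, atom):
        self._update(atom, +1)

    def remove(self, atom):
        self._update(atom, -1)

    def label(self, idx):
        if self.vdw.get(idx, 0) > 0:
            return "VDW"
        if self.sas.get(idx, 0) > 0:
            return "SAS"
        return "OUT"

grid = BallGrid(h=1.0)
grid.add((0.0, 0.0, 0.0, 1.5))
grid.add((2.0, 0.0, 0.0, 1.5))      # gridpoint (2, 0, 0) goes SAS -> VDW
grid.remove((2.0, 0.0, 0.0, 1.5))   # and back: VDW -> SAS
```

With the spacing h proportional to the probe radius and the probe radius proportional to the largest atom radius, the bounding box holds a bounded number of gridpoints, which is the intuition behind the O(1) claims on the two slides.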

Structural Properties _I

Structural Properties _II

Structural Properties _III We shall correct this critical problem in the next lecture on Smooth Structural Interfaces!

Challenge #2: Bimolecular Models in Solvent
Computational Problems
- Smooth Interfaces
- Parameterization
- Area and Volume Derivatives
- Dynamic Updates
Techniques
- NFFT for fast summations
- Fast Dynamic Particle Maintenance
- Integrals, Quadrature
Challenges
- Protein Flexibility
- Protein Folding, Rotamer Packing
- Spontaneous Assembly
NEXT: Aperiodic Quasi-crystals or Quasi-lattices

Some Useful Software Links
Some useful Structure Links
- The molecular energetics and docking client/viewer TexMol and a user manual
- The molecular viewer PyMOL and this tutorial
- The program MODELLER
Some useful Databases/Servers
- The PDB, the repository of experimentally determined protein structures
- NCBI PSI-BLAST
- MUSCLE protein multiple sequence alignment server
- PredictProtein protein sequence analysis web server: secondary structure prediction, coiled-coils, transmembrane helices, fold recognition, ...
- PROSITE protein sequence patterns
- SCOP structure classification database
- matrix2png, a handy bioinformatics tool
- Bioinfo MetaServer, consensus fold recognition server