Basics of protein structure

Today:
1. Projects
   a. Requirements:
      i. Critical review of one paper
      ii. At least one computational result
   b. Written report and oral presentation are due at noon, Dec. 3rd; submit via email to bphys101@fas.harvard.edu
      i. Account for the role each group member played
      ii. Late policy: 5% lost per day late
   c. Grading rubric: see attached
2. Protein structure
   a. Basics
   b. Experimental determination
      i. X-ray crystallography
      ii. NMR
   c. Classification
   d. Prediction: 2-D, 3-D

Basics of protein structure

1. Primary structure is the sequence of amino acids that compose the protein.
2. Different regions of the sequence form local secondary structures, such as alpha helices and beta strands.
3. Tertiary structure is formed by packing secondary structural elements into one or several compact globular units called domains.
4. The final protein may contain several polypeptide chains arranged in a quaternary structure.

The secondary structure is formed through hydrogen bonds between amino acids in the sequence.
Further interactions result in the formation of the tertiary structure (with help from chaperones, membrane proteins).
Some motifs form distinct, highly predictable secondary structure.
The core of a protein is more tightly packed and highly specialized than the exterior.

Protein Structure Elucidation

X-ray crystallography

The 3-D structure of a protein is determined by directing a beam of x-rays onto a regular, repeating array of many identical protein molecules (a crystal), which diffracts the x-rays. The resulting diffraction pattern can be used to determine the structure of the protein of interest. Crystallization of the protein of interest is usually difficult to achieve.

The amplitudes and phases of the diffraction data from the protein crystals are used to calculate an electron-density map. The quality of the map depends on the resolution of the diffraction data, which in turn depends on how well-ordered the crystals are. The resolution is measured in Å (angstrom) units; the smaller this number is, the higher the resolution and therefore the greater the amount of detail that can be seen.

NMR (Nuclear Magnetic Resonance)

This method uses the magnetic properties of atomic nuclei. It can be exploited to give information on the distances between atoms in a molecule, using atomic nuclei such as 1H, 13C, 15N, and 31P that have a magnetic moment or spin.

X-ray Crystallography vs. NMR

X-ray Crystallography
1. large structures possible (e.g. the ribosome)
2. crystallization parameters difficult to define, largely empirical
3. crystals hard to get, major bottleneck of method
4. proteins are packed in crystal

NMR
1. only small structures (<~300 aa)
2. proteins are in solution

Protein Classification

Parameters: structural and sequence similarity. Very different sequences and species might give the same 3-D structure, and highly similar sequences might give very different structures.

Main structural classes:
Class α: Bundle of α helices connected by loops on the protein surface
Class β: Antiparallel β sheets
Class α/β: Mixed helices and sheets
Class α + β: Segregated helices and sheets
Multidomain: Domains that fall into several categories

3-D structures determined via X-ray crystallography and NMR are deposited in the Protein Data Bank (formerly the Brookhaven Protein Data Bank) as PDB entries: http://www.rcsb.org/pdb/
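To make the idea of a deposited structure concrete, here is a minimal sketch of how atomic coordinates can be read from the fixed-column ATOM records of a PDB entry and turned into an interatomic distance. The helper names (format_ca_record, parse_atom_record, distance) and the coordinates are made up for illustration; only the column positions come from the PDB file format.

```python
import math


def format_ca_record(serial, res_name, chain, res_seq, x, y, z):
    """Build a fixed-column ATOM record for a C-alpha atom (hypothetical helper)."""
    return (f"ATOM  {serial:>5}  CA  {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom_record(line):
    """Read the fields of an ATOM record by column position."""
    return {
        "atom": line[12:16].strip(),      # columns 13-16: atom name
        "res_name": line[17:20].strip(),  # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "res_seq": int(line[22:26]),      # columns 23-26: residue number
        # columns 31-54: x, y, z coordinates in angstroms
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def distance(a, b):
    """Euclidean distance in angstroms between two parsed atoms."""
    return math.dist(a["xyz"], b["xyz"])


# Made-up coordinates for two neighboring C-alpha atoms (not from a real entry).
records = [
    format_ca_record(2, "ALA", "A", 1, 0.0, 0.0, 0.0),
    format_ca_record(7, "GLY", "A", 2, 3.8, 0.0, 0.0),
]
atoms = [parse_atom_record(r) for r in records]
ca_distance = distance(atoms[0], atoms[1])
print(f"CA-CA distance: {ca_distance:.1f} angstroms")
```

Reading by column rather than splitting on whitespace matters because adjacent fields in real entries can run together without a separating space.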

Protein Structure Prediction

Some proteins can be completely denatured and then renatured: all of the information needed for proper folding is in the amino acid sequence.

Forms of Secondary Structure

Alpha helices, beta sheets: highly constrained in space
Loops: connect regions of defined structure; less constrained, so more substitutions and deletions can occur
Coils: catch-all term for regions that don't fit the above categories

Prediction of Secondary Structure

Assumption: There is a correlation between amino acid sequence and secondary structure. A given short sequence is more likely to form one type of structure than another.

Chou-Fasman/GOR Method

Chou-Fasman: Calculated the frequencies of each amino acid in each type of secondary structure (helix, sheet, and loop) and used them as predictive probabilities for novel sequences. These frequencies are combined to calculate the probability that a window of amino acids forms each type of structure; if the probability is above a threshold, the method tries to extend the window into larger regions of structure.

GOR: Similar approach, but used 17-AA windows for probabilities and frequencies. Likely regions of structure are determined using information theory and conditional probabilities.
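The frequency-and-window idea behind Chou-Fasman can be sketched in a few lines. This is a simplified illustration, not the published algorithm: the training sequences are invented, the propensities are computed from that toy set rather than taken from the original tables, and the extension step is replaced by simply labeling every residue of any window whose mean propensity exceeds the threshold. The names TRAINING, propensities, and predict are assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy training data: (sequence, per-residue structure),
# with H = helix, E = strand (sheet), C = loop/coil.
TRAINING = [
    ("AELLKAGVVTVYG", "HHHHHHCEEEEEC"),
    ("MEEALKGSVIVTY", "HHHHHHCCEEEEE"),
]


def propensities(training):
    """Propensity = frequency of a residue within a state / its overall frequency."""
    aa_counts, state_counts, pair_counts = Counter(), Counter(), Counter()
    for seq, states in training:
        for aa, state in zip(seq, states):
            aa_counts[aa] += 1
            state_counts[state] += 1
            pair_counts[aa, state] += 1
    total = sum(aa_counts.values())
    return {
        (aa, state): (n / state_counts[state]) / (aa_counts[aa] / total)
        for (aa, state), n in pair_counts.items()
    }


def predict(seq, table, window=4, threshold=1.0):
    """Label residues H or E where a window's mean propensity beats the threshold."""
    labels = ["C"] * len(seq)
    best = [threshold] * len(seq)
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        for state in "HE":
            # Residue/state pairs never seen in training count as zero propensity.
            score = sum(table.get((aa, state), 0.0) for aa in chunk) / window
            if score > threshold:
                for i in range(start, start + window):
                    if score > best[i]:
                        best[i], labels[i] = score, state
    return "".join(labels)


table = propensities(TRAINING)
print(predict("EELLKK", table))
print(predict("VVTVY", table))
```

A propensity above 1 means the residue occurs in that structure more often than expected from its overall frequency, which is why 1.0 is a natural default threshold here.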

Patterns of Hydrophobic Amino Acids

Helices on the protein surface have hydrophobic amino acids facing the core and hydrophilic amino acids facing the exterior, giving a periodic 2:1 ratio of hydrophilic to hydrophobic residues. Characteristic patterns are also observed in other well-known structures like leucine zippers, supercoils, transmembrane proteins, etc.

Neural Network Models

Computer programs are trained to recognize amino acid patterns that are located in known secondary structures and to distinguish these patterns from other patterns not located in these structures. Weights of units at each layer are adjusted until the input yields optimal output given the training set data. This is the most sophisticated method currently used, and is theoretically able to extract the most information out of the data.

Nearest-Neighbor Methods

Also uses machine learning: predicts the secondary structural conformation of an amino acid in the query sequence by identifying similar sequences of known structures. This is done using moving windows that slide along the query sequence; the sequence within the window is compared to that in windows of known structures.

Analysis of 3-D Structures

Compare sequences of known structures to identify other proteins that might have similar structural features. Multiple-sequence alignments are used to identify motifs.

Threading: The sequence of amino acids in a protein of unknown structure is tested for its ability to fit into a known 3-D structure. The size and chemistry of each amino acid's R group, and its proximity to other R groups, are used as parameters for goodness of fit.

Alignments of two sequences via regions of secondary structure:
   o Dynamic programming
   o Distance matrix
   o Fast alignment using similarities of α helices and β sheets, e.g., VAST, SARF

Significance of alignment: The significance is determined in a way analogous to BLAST's E-values. The number of superimposed secondary structural elements found when comparing two structures is contrasted with the number found if comparing random structures of the same size.

Acknowledgement: This handout contained material written by Doug Selinger
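The moving-window comparison described under Nearest-Neighbor Methods above can be sketched as follows. This is a minimal illustration under stated assumptions: the database KNOWN is invented, similarity is a plain identity count standing in for a substitution-matrix score, and residues too close to either end to have a full window are simply labeled C. The function names are hypothetical.

```python
from collections import Counter

# Hypothetical database of sequences with known per-residue structure
# (H = helix, E = strand, C = coil).
KNOWN = [
    ("AELLKKLAEG", "HHHHHHHHHC"),
    ("GTVKVYVEGS", "CEEEEEEECC"),
]


def windows(seq, states, size):
    """Yield (window sequence, structure of the central residue) pairs."""
    half = size // 2
    for i in range(half, len(seq) - half):
        yield seq[i - half:i + half + 1], states[i]


def similarity(a, b):
    """Count identical positions (stand-in for a substitution-matrix score)."""
    return sum(x == y for x, y in zip(a, b))


def predict(query, known, size=5, k=3):
    """Assign each residue the majority structure of its k most similar windows."""
    half = size // 2
    library = [w for seq, states in known for w in windows(seq, states, size)]
    labels = []
    for i in range(len(query)):
        if i < half or i >= len(query) - half:
            labels.append("C")  # no full window at the ends; default to coil
            continue
        win = query[i - half:i + half + 1]
        nearest = sorted(library, key=lambda item: similarity(win, item[0]),
                         reverse=True)[:k]
        votes = Counter(state for _, state in nearest)
        labels.append(votes.most_common(1)[0][0])
    return "".join(labels)


print(predict("AELLKKLAEG", KNOWN, k=1))
```

With k=1 the prediction copies the structure of the single best-matching window; larger k trades that sensitivity for a majority vote over several neighbors.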