D Dobbs ISU - BCB 444/544X 1


11/7/05
Protein Structure: Classification, Databases, Visualization

Announcements
BCB 544 Projects - Important Dates:
- Nov 2 Wed noon - Project proposals due to David/Drena
- Nov 4 Fri PM - Approvals/responses & tentative presentation schedule to students
- Dec 2 Fri noon - Written project reports due
- Dec 5,7,8,9 class/lab - Oral Presentations (20')
- (Dec 15 Thurs = Final Exam)

Bioinformatics Seminars
- Nov 7 Mon 12:10 IG Faculty Seminar in 101 Ind Ed II
  Inborn Errors of Metabolism in Humans & Animal Models
  Matt Ellinwood, Animal Science, ISU
- Nov 10 Thurs 3:40 Com S Seminar in 223 Atanasoff
  Computational Epidemiology
  Armin R. Mikler, Univ. North Texas
  http://www.cs.iastate.edu/~colloq/#t3

CORRECTION: Bioinformatics Seminars
Next week - Baker Center/BCB Seminars (seminar abstracts available at above link):
- Nov 14 Mon 1:10 PM Doug Brutlag, Stanford: Discovering transcription factor binding sites
- Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas: Modeling protein-protein interactions
Both seminars will be in Howe Hall Auditorium.

Protein Structure & Function: Analysis & Prediction
- Mon: Protein structure: classification, databases, visualization
- Wed: Protein structure: prediction & modeling
- Thurs Lab: Protein structure prediction
- Fri: Protein-nucleic acid interactions; protein-ligand docking

Reading Assignment (for Mon-Fri)
Mount Bioinformatics Chp 10, Protein classification & structure prediction
http://www.bioinformaticsonline.org/ch/ch10/index.html pp. 409-491
Ck Errata: http://www.bioinformaticsonline.org/help/errata2.html
Other? Additional reading assignments for BCB 544

Review last lecture: RNA Structure Prediction Algorithms

RNA structure prediction strategies
Secondary structure prediction
1) Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational

1) Energy minimization method
What are the assumptions?
- Native tertiary structure or "fold" of an RNA molecule is (one of) its lowest free energy configuration(s)
- Gibbs free energy = ΔG in kcal/mol at 37 °C = equilibrium stability of structure; lower (more negative) values are more favorable
Is this assumption valid in vivo? It may not hold, but we don't really know.

Gibbs free energy: ΔG
Gibbs free energy (G) is formally defined in terms of the state functions enthalpy & entropy, & the state variable temperature:
  G = H - TS
  ΔG = ΔH - TΔS (for constant temp)
- Enthalpy (H) = heat content of a system; ΔH = amount of heat absorbed by the system at constant pressure
- Entropy (S) = measure of the amount of disorder or randomness in a system
Note: this is not the same as "entropy" in information theory, but is related; see http://en.wikipedia.org/wiki/information_theory

Gibbs free energy: ΔG
Gibbs free energy for formation of an RNA or protein structure = ΔG = equilibrium stability of that structure at a specific temperature (kcal/mol at 37 °C)
  ΔG = -RT ln K_eq
  R = gas constant, T = absolute temperature, K_eq = equilibrium constant

Nearest-neighbor parameters
Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37 °C), & most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999
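To make the relation ΔG = -RT ln K_eq above concrete, here is a minimal Python sketch. It assumes a simple two-state (folded/unfolded) equilibrium, which the slides do not state. The function and constant names are illustrative choices, not part of the lecture.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_37C = 310.15      # 37 degrees C expressed in kelvin


def delta_g_from_keq(k_eq, temp_k=T_37C):
    """Return delta G (kcal/mol) for an equilibrium constant K_eq."""
    return -R_KCAL * temp_k * math.log(k_eq)


def keq_from_delta_g(delta_g, temp_k=T_37C):
    """Invert the relation: K_eq = exp(-delta G / RT)."""
    return math.exp(-delta_g / (R_KCAL * temp_k))


def fraction_folded(delta_g, temp_k=T_37C):
    """Fraction of molecules in the structured state.

    Assumption: two-state equilibrium, K_eq = [folded] / [unfolded].
    """
    k_eq = keq_from_delta_g(delta_g, temp_k)
    return k_eq / (1.0 + k_eq)


if __name__ == "__main__":
    for dg in (0.0, -1.42, -5.0):
        print(dg, keq_from_delta_g(dg), fraction_folded(dg))
```

At 37 °C, RT is about 0.62 kcal/mol, so each additional -1.4 kcal/mol of stability raises K_eq roughly tenfold. This is one way to see why a structure that is only a few kcal/mol more stable than its alternatives can still dominate the population.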

Energy minimization - calculations (Fig 6.3, Baxevanis & Ouellette 2005)
Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:
- helical stacking (sequence dependent)
- loop initiation
- unpaired stacking
(favorable "increments" are < 0)

But how many possible conformations for a single RNA molecule?
Huge number: Zuker estimates (1.8)^N possible secondary structures for a sequence of N nucleotides
For 100 nts (a small RNA) = 3 x 10^25 structures!
Solution? Not exhaustive enumeration, but dynamic programming:
- O(N^3) in time, O(N^2) in space/storage, iff pseudoknots excluded
- otherwise: O(N^6) in time, O(N^4) in space

Algorithms based on energy minimization
For an outline of the algorithm used in Mfold, including a description of the dynamic programming recursion, please visit Michael Zuker's lecture:
http://www.bioinfo.rpi.edu/~zukerm/lectures/rnafold-html
From this site, you may also download Zuker's lecture as either a PDF or PS file.

2) Comparative sequence analysis (co-variation)
Two basic approaches:
- Algorithms constrained by initial alignment
  - Much faster, but not as robust as unconstrained
  - Base-pairing probabilities determined by a partition function
- Algorithms not constrained by initial alignment
  - Genetic algorithms often used for finding an alignment & set of structures

RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
1) Extensive comparative sequence analysis to predict tertiary contacts (co-variation), e.g., MANIP - Westhof
2) Use experimental data to constrain model building, e.g., MC-SYM - Major
3) Homology modeling using sequence alignment & reference tertiary structure (not many of these!)
4) Low resolution molecular mechanics, e.g., yammp - Harvey

Last Time: Protein Structure & Function (new)
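Returning to the RNA dynamic-programming point above: the sketch below first reproduces the (1.8)^N estimate, then shows the shape of an O(N^3) time, O(N^2) space recursion. It is not the Mfold energy model. It uses the simpler Nussinov-style base-pair maximization as a stand-in, ignores pseudoknots, and assumes G-U wobble pairs are allowed and that hairpin loops need at least 3 unpaired bases. Those parameters and all function names are illustrative assumptions.

```python
def estimated_structure_count(n, base=1.8):
    """Zuker-style estimate of the number of secondary structures for n nts."""
    return base ** n


# Assumed pairing rules: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (no pseudoknots).

    best[i][j] holds the best score for the subsequence seq[i..j].
    Three nested loops over (span, i, k) give O(N^3) time; the table
    gives O(N^2) space.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case 1: j is unpaired
            for k in range(i, j - min_loop):    # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(f"{estimated_structure_count(100):.1e}")
    print(nussinov_max_pairs("GGGAAAUCC"))
```

An energy-based method has the same table-filling structure but replaces "+1 per pair" with nearest-neighbor free energy increments and takes a minimum instead of a maximum.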

Protein Structure & Function
- Protein structure - primarily determined by sequence
- Protein function - primarily determined by structure
- Globular proteins: compact hydrophobic core & hydrophilic surface
- Membrane proteins: special hydrophobic surfaces
- Folded proteins are only marginally stable
- Some proteins do not assume a stable "fold" until they bind to something = intrinsically disordered
- Predicting protein structure and function can be very hard, & fun!

4 Basic Levels of Protein Structure

Primary & Secondary Structure
Primary
- Linear sequence of amino acids
- Description of covalent bonds linking aa's
Secondary
- Local spatial arrangement of amino acids
- Description of short-range non-covalent interactions
- Periodic structural patterns: α-helix, β-sheet

Tertiary & Quaternary Structure
Tertiary
- Overall 3-D "fold" of a single polypeptide chain
- Spatial arrangement of 2° structural elements; packing of these into compact "domains"
- Description of long-range non-covalent interactions (plus disulfide bonds)
Quaternary
- In proteins with > 1 polypeptide chain, spatial arrangement of subunits

"Additional" Structural Levels
- Super-secondary elements
- Motifs
- Domains
- Foldons

New Today: Protein Structure & Function
- Amino acids characteristics
- Structural classes & motifs
- Protein functions & functional families (not much - more on this later)
- Classification
- Databases
- Visualization

Amino Acids
- Each of 20 different amino acids has a different "R-group", a side chain attached to Cα
- Peptide bond is rigid and planar

[Figure slides: Hydrophobic Amino Acids; Charged Amino Acids; Polar Amino Acids]

Certain side-chain configurations are energetically favored (rotamers)

Ramachandran plot: "Allowable" psi & phi angles
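The phi and psi angles on a Ramachandran plot are backbone dihedral angles. Phi is defined by the atoms C(i-1), N(i), Cα(i), C(i), and psi by N(i), Cα(i), C(i), N(i+1). The sketch below shows one common way to compute a dihedral from four 3-D points in plain Python. The function names are illustrative, and the coordinates in the example are made up to give easy-to-check angles. They are not taken from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle (degrees, -180..180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Synthetic points: central bond along z, outer atoms 90 degrees apart.
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to each residue's backbone atoms, this gives one (phi, psi) point per residue. Plotting those points for a whole protein produces the Ramachandran plot.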

Glycine is the smallest amino acid
- R group = H atom
- Glycine residues increase backbone flexibility because the R group is only an H atom (no bulky side chain)

Proline is cyclic
- Proline residues reduce flexibility of polypeptide chain
- Proline cis-trans isomerization is often a rate-limiting step in protein folding
- Recent work suggests it may also regulate ligand binding in native proteins - Andreotti

Cysteines can form disulfide bonds
- Disulfide bonds (covalent) stabilize 3-D structures
- In eukaryotes, disulfide bonds are found mainly in secreted proteins or extracellular domains

Globular proteins have a compact hydrophobic core
- Packing of hydrophobic side chains into the interior is the main driving force for folding
- Problem? Polypeptide backbone is highly polar (hydrophilic) due to polar -NH and C=O in each peptide unit; these polar groups must be neutralized
- Solution? Form regular secondary structures, e.g., α-helix, β-sheet, stabilized by H-bonds

Exterior surface of globular proteins is generally hydrophilic
- Hydrophobic core formed by packed secondary structural elements provides compact, stable core
- "Functional groups" of protein are attached to this framework; exterior has more flexible regions (loops) and polar/charged residues
- Hydrophobic "patches" on protein surface are often involved in protein-protein interactions

Protein Secondary Structures
- α Helix
- β Sheets
- Loops
- Coils

α-Helix
- Most abundant 2° structure in proteins
- Average length = 10 aa's (~15 Angstroms, at 1.5 Angstroms rise per residue)
- Length varies from 5-40 aa's
- Alignment of H-bonds creates dipole moment (partial positive charge at the N-terminal, NH, end)
- Often at surface of core, with hydrophobic residues on inner-facing side, hydrophilic on other side

α helix is stabilized by H-bonds between ~ every 4th residue
[Figure legend: C = black, O = red, N = blue]

R-groups are on outside of α helix

Types of α helices
- "Standard" α helix: 3.6 residues per turn
- H-bonds between C=O of residue n and NH of residue n + 4
- Helix ends are polar; almost always on surface of protein
- Other types of helices?
  - n + 5 = π helix
  - n + 3 = 3_10 helix

Certain amino acids are "preferred" & others are rare in α helices
- Ala, Glu, Leu, Met = good helix formers
- Pro, Gly, Tyr, Ser = very poor
- Amino acid composition & distribution varies, depending on location of helix in 3-D structure

β-strands & Sheets
- H-bonds formed between 5-10 consecutive residues in one portion of chain with another set of 5-10 residues farther down chain
- Interacting regions may be adjacent (with short loop between) or far apart
- β-sheets usually have all strands either parallel or antiparallel
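The α-helix numbers given earlier on this page (3.6 residues per turn, H-bonds from residue n to n + 4, one hydrophobic face) can be turned into a small helical-wheel calculation. The sketch below is illustrative only. The 1.5 Angstrom rise per residue is the standard textbook value, the hydrophobic residue set is a crude assumption, and the +1/-1 weighting is a simplification of published hydrophobicity scales. All function names are hypothetical.

```python
import math

RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN   # about 100 degrees
RISE_PER_RESIDUE = 1.5                            # Angstroms (assumed constant)
HYDROPHOBIC = set("AVLIMFW")                      # crude illustrative grouping


def helix_length_angstrom(n_residues):
    """Approximate length of an ideal alpha helix."""
    return n_residues * RISE_PER_RESIDUE


def hbond_pairs(n_residues):
    """(n, n + 4) backbone H-bond partners, residues numbered from 1."""
    return [(n, n + 4) for n in range(1, n_residues - 3)]


def wheel_angles(sequence):
    """Angular position of each residue when viewed down the helix axis."""
    return [(aa, (i * DEGREES_PER_RESIDUE) % 360.0)
            for i, aa in enumerate(sequence)]


def hydrophobic_moment(sequence):
    """Mean resultant vector length; high values suggest an amphipathic helix."""
    if not sequence:
        return 0.0
    x = y = 0.0
    for aa, angle in wheel_angles(sequence):
        weight = 1.0 if aa in HYDROPHOBIC else -1.0
        x += weight * math.cos(math.radians(angle))
        y += weight * math.sin(math.radians(angle))
    return math.hypot(x, y) / len(sequence)


if __name__ == "__main__":
    print(helix_length_angstrom(10))
    print(hbond_pairs(6))
    print(hydrophobic_moment("LKKLLKKLLKLLKKLLKK"))
```

A sequence whose hydrophobic residues recur every 3 to 4 positions scores high, which matches the slide's picture of a surface helix with one hydrophobic, inner-facing side. A uniform sequence scores near zero.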

[Figure slides: Antiparallel β-sheet (two views); Parallel β-sheet]

Mixed β-sheets also occur

Loops
- Connect helices and sheets
- Vary in length and 3-D configurations
- Are located on surface of structure
- Are more "tolerant" of mutations
- Are more flexible and can adopt multiple conformations
- Tend to have charged and polar amino acids
- Are frequently components of active sites
- Some fall into distinct structural families (e.g., hairpin loops, reverse turns)

Coils
- Regions of 2° structure that are not helices, sheets, or recognizable turns
- Intrinsically disordered regions appear to play important functional roles

Globular proteins are built from recurring structural patterns
- Motifs or supersecondary structures = combinations of 2° structural elements
- Domains = combinations of motifs
  - Independently folding unit (foldon)
  - Functional unit

A few common structural motifs
- Helix-turn-helix: e.g., DNA binding
- Helix-loop-helix: e.g., calcium binding
- β-hairpin: 2 adjacent antiparallel strands connected by short loop
- Greek key: 4 adjacent antiparallel strands
- β-α-β: 2 parallel strands connected by helix

[Figure slides: H-T-H; H-L-H; β-hairpin; Greek key; Beta-alpha-beta]

Simple motifs combine to form domains

Large polypeptide chains fold into several domains

6 main classes of protein structure
1) α Domains: bundles of helices connected by loops
2) β Domains: mainly antiparallel sheets, usually with 2 sheets forming sandwich
3) α/β Domains: mainly parallel sheets with intervening helices, also mixed sheets
4) α+β Domains: mainly segregated helices and sheets
5) Multidomain (α & β): containing domains from more than one class
6) Membrane & cell-surface proteins

[Figure slides: α-domain structures: coiled-coils; α-domain structures: 4-helix bundles; All-α proteins: Globins]

β-domain structures
- Anti-parallel β structures
- Functionally most diverse
- Includes:
  - Up-and-down sheets or barrels
  - Propeller-like structures
  - Jelly roll barrels (from Greek key motifs)

[Figure slide: Up-and-down sheets and barrel]

Up-and-down sheets can form propeller-like structures

Greek key motifs can form jelly roll barrels

α/β-domain structures: 3 main classes
- TIM barrel = core of twisted parallel strands close together
- Rossmann fold = open twisted sheet surrounded by helices on both sides
- Leucine-rich motif = specific pattern of Leu residues; strands form a curved sheet with helices on outside

[Figure slides: TIM barrel; Rossmann fold]

Leucine-rich motifs can form α/β horseshoes

Protein structure databases, structural classification & visualization
- PDB = Protein Data Bank
  http://www.rcsb.org/pdb/ (RCSB) - several different structure viewers
- MMDB = Molecular Modeling Database
  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=structure (NCBI Entrez) - Cn3D viewer
- SCOP = Structural Classification of Proteins
  Levels reflect both evolutionary and structural relationships
- CATH = Classification by Class, Architecture, Topology and Homology
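Structure viewers for PDB entries work from atomic coordinate records. As a rough illustration, the sketch below reads coordinates out of the legacy fixed-column PDB text format (ATOM/HETATM lines) using plain string slicing. The function names and the sample line are made up for this example, and the sketch assumes well-formed lines. It does not handle mmCIF files or any other record type.

```python
# One made-up ATOM line laid out in the legacy PDB fixed-column format.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"


def parse_atom_record(line):
    """Parse one ATOM/HETATM line into a dict, or return None for other lines."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16, atom name
        "res_name": line[17:20].strip(),  # columns 18-20, residue name
        "chain": line[21],                # column 22, chain identifier
        "res_seq": int(line[22:26]),      # columns 23-26, residue number
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }


def ca_coordinates(lines):
    """Return (x, y, z) for every alpha-carbon (CA) atom in the given lines."""
    coords = []
    for line in lines:
        record = parse_atom_record(line)
        if record is not None and record["name"] == "CA":
            coords.append((record["x"], record["y"], record["z"]))
    return coords


if __name__ == "__main__":
    print(parse_atom_record(SAMPLE))
```

A list of Cα coordinates like the one returned by `ca_coordinates` is the usual starting point for a simple backbone trace or for comparing two structures.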