Protein Structure. Role of (bio)informatics in drug discovery. Bioinformatics

Bioinformatics: Protein Structure Principles & Architecture
Marjolein Thunnissen, Dept. of Biochemistry & Structural Biology, Lund University, September 2011

Homology, pattern and 3D structure searches need databases and database management tools, search techniques, and dedicated tools for sequence and structure comparison, detection of similarity, homology modelling, etc. All of this is the subject of bioinformatics.

Why use bioinformatics?
An explosive growth in the amount of biological information.
A more global perspective in experimental design.
Data mining.
The potential for uncovering phylogenetic relationships and evolutionary patterns.

Role of (bio)informatics in drug discovery
Pipeline: Genome > Gene > Protein > HTS > Hit > Lead > Candidate drug
Disciplines along the pipeline: genomics, bioinformatics, structural bioinformatics, chemoinformatics, structure-based drug design, ADMET modelling.
Structural bioinformatics techniques are valuable in areas from target identification to lead discovery.

Why study 3D structures of biological macromolecules?
1. FUNCTION IS STRUCTURE!
2. Sequence homology is not enough to identify functional relationships.
3. Protein folding is still not fully understood. Predictions do not yet work satisfactorily.
4. Drug design (pharmaceutical industry).

Proteins are polymers
Proteins are formed by a chain of repeating molecules. One such molecule is called an amino acid. There are 20 types of amino acids, but they all have a common backbone or main chain. The protein chain is formed by linking the amino acids together. The linkage is called the peptide bond. The chain of amino acids linked to each other by peptide bonds is also called a polypeptide chain. The DNA code specifies 20 different amino acids.

The 20 amino acids
In proteins 20 different amino acids are found. The names of the different amino acids can be given as a 3-letter code or a 1-letter code: Alanine > Ala > A. The amino acids can be divided into subgroups depending on the nature of their side chain.
Group 1, hydrophobic: Ala (A), Val (V), Leu (L), Ile (I), Phe (F), Pro (P) and Met (M)
Group 2, charged: Asp (D), Glu (E), Arg (R), Lys (K)
Group 3, polar: Ser (S), Thr (T), Cys (C), Asn (N), Gln (Q), His (H), Tyr (Y) and Trp (W)
Group 4, no special properties: Gly (G)
Alternatively there is also a fifth group:
Group 5, aromatic rings: Phe (F), Tyr (Y), Trp (W) and His (H)

The 20 amino acids: hydrophobic residues
Alanine (Ala, A), Valine (Val, V), Proline (Pro, P), Leucine (Leu, L), Isoleucine (Ile, I), Phenylalanine (Phe, F), Methionine (Met, M)
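The side-chain grouping above can be written down as a small lookup table. The following is only a minimal sketch: the group names and members are taken from the slide, while the function names (classify, composition) and the dictionary layout are illustrative assumptions.

```python
# Side-chain groups as listed on the slide (one-letter codes).
# The dictionary layout and function names are illustrative assumptions.
GROUPS = {
    "hydrophobic": set("AVLIFPM"),
    "charged": set("DERK"),
    "polar": set("STCNQHYW"),
    "no special properties": set("G"),
}

# Alternative fifth group from the slide; it overlaps the groups above.
AROMATIC = set("FYWH")


def classify(residue):
    """Return the slide's group name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in GROUPS.items():
        if code in members:
            return name
    raise ValueError(f"not one of the 20 standard amino acids: {residue!r}")


def composition(sequence):
    """Count how many residues of a sequence fall into each group."""
    counts = {name: 0 for name in GROUPS}
    for residue in sequence:
        counts[classify(residue)] += 1
    return counts


if __name__ == "__main__":
    print(classify("W"))            # group of tryptophan
    print(composition("GAVDKSTW"))  # group counts for a short test sequence
```

Keeping the aromatic group separate reflects the slide's wording: it is an alternative view of residues that already belong to the hydrophobic or polar groups.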

The 20 amino acids: charged residues
Aspartic acid (Asp, D), Glutamic acid (Glu, E), Arginine (Arg, R), Lysine (Lys, K)

The 20 amino acids: polar residues
Serine (Ser, S), Threonine (Thr, T), Tyrosine (Tyr, Y), Histidine (His, H), Cysteine (Cys, C), Asparagine (Asn, N), Glutamine (Gln, Q), Tryptophan (Trp, W)

The 20 amino acids: Glycine
Glycine (Gly, G)

Structure in four dimensions
Because proteins are built from 20 different amino acids, their structure is described at four levels (dimensions):
Primary structure: the amino acid sequence.
Secondary structure: local regular structure, α-helices and β-sheets.
Tertiary structure: packing of secondary structure into one or several compact globular domains.
Quaternary structure: the overall arrangement if the protein consists of several polypeptide chains.

Special properties of amino acids
Since there are 4 different groups attached to the central Cα atom of an amino acid (except for glycine), it is an asymmetric atom. Amino acids are therefore chiral molecules. There are two forms: the L-form and the D-form. The natural configuration of amino acids in proteins is always the L-form.

Cysteines can form cross-links
Cysteine residues from different parts of the sequence can link together in a disulfide bridge to form cross-links. The environment needs to be oxidative. Within the cell the environment is reductive, so cross-bridges are not often seen there; they are quite normal for extracellular proteins. These cross-links give extra stability to a protein structure. They can also link two polypeptide chains together.

Properties of the peptide bond
The peptide bond unit containing the atoms C(n), O(n) and N(n+1) is a rigid plane with bond lengths and angles nearly the same for each of these units in a polypeptide chain. The freedom in conformation of this chain comes from rotating around the bonds N(n+1)-Cα(n+1) and Cα(n+1)-C(n+1).

Phi-Psi angles
The rotation around N-Cα is called phi (φ) and the rotation around Cα-C is called psi (ψ). Each amino acid is associated with these two conformational angles. If phi and psi are known for each residue, the conformation of the whole backbone chain is known, since the peptide planes are so rigid.

Go to King: Basic
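As a rough illustration of how φ and ψ are obtained from atomic coordinates, the sketch below computes a signed dihedral angle from four points using the common atan2 formulation. The atom sets used here (φ from C(i-1), N(i), Cα(i), C(i); ψ from N(i), Cα(i), C(i), N(i+1)) follow the standard convention rather than anything stated on the slide, and all function names are assumptions made for the example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-CA bond of residue i."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the CA-C bond of residue i."""
    return dihedral(n, ca, c, n_next)


if __name__ == "__main__":
    # Four made-up points, not real protein coordinates.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Coordinates are plain (x, y, z) tuples in any consistent unit; in practice they would be read from a structure file.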

Ramachandran plot
Most combinations of φ/ψ are not allowed since they would cause steric collisions between side chains and main chain (kinemage). The φ/ψ pairs can be plotted against each other. Such a plot is called a Ramachandran plot. The residues will cluster in certain areas. These areas are named after the secondary structure the residues adopt.
[Figure: low and high energy conformations (allowed and disallowed)]

Glycine residues
Glycine residues lack a side chain. Therefore they can have a much wider range of conformations than other residues. Glycines are often used to allow unusual main chain conformations (like a tight turn).
[Figure: Ramachandran plot of barnase]

Rotamers for Phe
Certain side chain conformations are energetically more favourable than others: these are more frequently seen in proteins. These conformers are called rotamers.
Go to King: Basic no. 4

Forces holding proteins together
Electrostatic interactions
Ionic interactions: e.g. salt bridges
Dipolar interactions: dipole-dipole, induced dipole
Hydrogen bonds: shared H atom
Hydrophobic packing: mainly entropic

Salt bridges and polar interactions
Ionic interactions occur either between fully charged groups (ionic) or between partially charged groups (dipole-dipole). The force of attraction between δ+ and δ- decreases rapidly with distance. In the absence of water these interactions can be very strong. In protein molecules ionic bonds occur between the charged residues. Combinations: Arg-Asp, Arg-Glu, Lys-Asp and Lys-Glu. Dipole-dipole interactions can occur e.g. between Asn-Thr or Ser-Gln (many more combinations are possible).

Hydrogen bonds
Hydrogen bonds occur when one hydrogen is shared between two atoms (mostly O and N atoms). One atom donates the hydrogen while the other accepts it. The hydrogen bond is strongest when the atoms lie in a straight line.
[Figure: examples from macromolecules: proteins, DNA]

Hydrophobic interactions
This is one major force behind protein folding. It is based on the fact that apolar and polar molecules do not like to mix, e.g. water and oil do not mix. The hydrophobic effect is really an entropy phenomenon: by clustering the hydrophobic molecules together there are fewer ordered water molecules. The main driving force behind protein folding is to pack hydrophobic residues into the interior of the protein, thereby creating a hydrophobic core. In proteins this means that the protein folds such that a core arises in which hydrophobic residues are buried.

Secondary structure
Problem: the backbone of an amino acid contains some highly polar atoms, O and N. These atoms have to be neutralized. Neutralization is achieved by the formation of hydrogen bonds: the O is an acceptor, while the N is a donor. Secondary structure is an elegant way for the protein to bury the polar peptide bond in the protein interior. There are two types of secondary structure: α-helices and β-sheets.

Alpha (α) helices
α-helices are found in proteins when consecutive residues all have φ/ψ angles of approximately -60° and -50°. This gives rise to helix formation. The α-helix is right-handed, has 3.6 residues per turn and rises 1.5 Å per residue. In proteins α-helices range from 4 or 5 residues up to over 40 residues in length, with an average length of 10 residues (15 Å).

Hydrogen bonding pattern in an α-helix
In the α-helix a very regular pattern of hydrogen bonds is formed. Hydrogen bonds are formed between the C=O of residue n and the NH of residue n+4. Therefore all these polar atoms are joined through hydrogen bonds. Exceptions are the NH groups at the beginning of the helix and the C=O groups at the end of the helix. The ends of the helix are polar and are found most often at the surface of the protein.

Amphipathic α-helix
A very common position for an α-helix is on the surface of the protein. This means that one side of the helix points towards the solution and the other side towards the hydrophobic core. Since there are 3.6 residues per turn, patterns arise where residues change from hydrophobic to hydrophilic every 3 to 4 residues. The helix is polar on one side and hydrophobic on the other: amphipathic. A way to look at sequences in a helix is to use a helical wheel representation. This is a projection of the residues on a plane perpendicular to the axis of the helix.

Connecting helices: helix-turn-helix motif
Go to King: Motif
DNA-binding motif
Ca-binding motif
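The helix geometry quoted above (3.6 residues per turn, 1.5 Å rise per residue, hydrogen bonds from residue n to residue n+4) is enough for a small numerical sketch of a helical wheel. Everything below is illustrative: the function names are assumptions, residue indices are 0-based, and the first residue is arbitrarily placed at 0°.

```python
# Geometry of the ideal alpha-helix as quoted in the text.
RISE_PER_RESIDUE = 1.5     # Angstrom
ANGLE_PER_RESIDUE = 100.0  # degrees, i.e. 360 degrees / 3.6 residues per turn


def helix_length(n_residues):
    """Length along the helix axis in Angstrom."""
    return n_residues * RISE_PER_RESIDUE


def helical_wheel(sequence):
    """Angular position of each residue when projected along the helix axis.

    Returns (residue, angle in degrees) pairs; the first residue sits at 0.
    """
    return [(res, (i * ANGLE_PER_RESIDUE) % 360.0)
            for i, res in enumerate(sequence)]


def hbond_pairs(n_residues):
    """Index pairs (n, n + 4): C=O of residue n bonded to N-H of residue n + 4."""
    return [(i, i + 4) for i in range(n_residues - 4)]


if __name__ == "__main__":
    print(helix_length(10))  # average helix of 10 residues
    for residue, angle in helical_wheel("ALKELLKKLLEELKG"):  # made-up sequence
        print(residue, angle)
    print(hbond_pairs(10))
```

Residues whose angles fall within the same half of the wheel lie on the same face of the helix, which is how the alternation every 3 to 4 residues shows up in an amphipathic helix.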

The Ca-binding motif
[Figure: residue conservation within the Ca²⁺-binding motif]

Four-helix bundle
[Figure: examples of 4-helix bundles]

The globin domain
[Figure: the globin domain]

Beta (β)-strands
The second major type of secondary structure is the β-sheet. In contrast with α-helices, these are not built from continuous stretches of sequence but from a combination of several regions of the polypeptide chain. These regions are 5 to 10 residues long and the residues are in a fully extended conformation with φ/ψ angles of around -135°/135°. Such a region is called a β-strand. The β-strands are aligned adjacent to each other so that hydrogen bonds can be formed between the C=O groups of one strand and the NH groups of another strand. The sheets that are formed are pleated: Cα atoms are alternately a little above and a little below the plane of the β-sheet. There are two alignments possible: parallel and antiparallel.

β-sheets: parallel & antiparallel
A sheet is called parallel if the amino acids in the strands all run in the same biochemical direction (amino-terminal to carboxyl-terminal). If the strands have an alternating pattern N --> C and then C --> N etc., then it is an antiparallel sheet.

Hydrogen bonding in β-sheets
The hydrogen bonding pattern is quite different between parallel and antiparallel sheets. In the antiparallel sheet narrowly spaced hydrogen bonds alternate with more widely spaced ones. The parallel sheet has more evenly spaced hydrogen bonds.
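The approximate φ/ψ values quoted for α-helices (-60°/-50°) and β-strands (-135°/135°) can be turned into a very crude classifier. The sketch below is only an illustration: the 40° tolerance and the box-shaped regions are arbitrary assumptions, not the boundaries of a real Ramachandran plot, and the function names are invented for the example.

```python
# Reference phi/psi values (degrees) as quoted in the text.
REFERENCE = {
    "alpha-helix": (-60.0, -50.0),
    "beta-strand": (-135.0, 135.0),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def classify_phi_psi(phi, psi, tolerance=40.0):
    """Label a phi/psi pair by the nearest reference conformation.

    The tolerance is an arbitrary illustrative choice; anything further
    than that from both references is reported as "other".
    """
    best_name, best_dist = "other", tolerance
    for name, (ref_phi, ref_psi) in REFERENCE.items():
        dist = max(angular_difference(phi, ref_phi),
                   angular_difference(psi, ref_psi))
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


if __name__ == "__main__":
    for pair in [(-62.0, -45.0), (-140.0, 130.0), (60.0, 60.0)]:
        print(pair, classify_phi_psi(*pair))
```

Real secondary structure assignment relies on hydrogen bonding patterns as well as backbone angles, so this should be read as a way of exploring the plot, not as an assignment method.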

Mixed β-sheets
β-sheets can also have a mixed character, partially parallel and partially antiparallel: mixed β-sheets. These are the most common β-sheets in proteins. Almost all β-sheets (whatever the type) have their strands twisted, and this twist always has the same handedness: a right-handed twist.
Go to King: Motifs

Loops and turns
Most proteins are built from several secondary structure elements which are linked to each other by loop regions. These loop regions differ in size and shape. The main chain C=O and NH groups do not interact with each other through hydrogen bonds; instead they are exposed. This is one reason that loops are often found on the surface of proteins, where these atoms can make hydrogen bonds with water molecules. Charged and polar residues are often found in loops. Some loops (especially in antiparallel β-sheets) are quite common: they are called hairpin loops.

How to represent protein structures?
In order to obtain the most information from pictures of protein structure we need to simplify. We use schematic cartoons for that.

Topology diagram
In order to have an overview of all the secondary structure elements and the order in which they appear in a protein, simple schematic drawings have been developed. These are called topology diagrams. In these, β-strands are represented by arrows and α-helices by cylinders.

β-sheet topology diagrams
[Figures: antiparallel β-sheet in aspartate transcarbamoylase; parallel β-sheet in flavodoxin; antiparallel barrel in plastocyanin]

Tertiary structure: motifs
Some simple combinations of secondary structure elements occur in many different proteins. These can consist of e.g. two helices connected by a loop, or two β-strands and a helix. These combinations are called supersecondary structures or motifs. Some of these motifs have a particular function (e.g. DNA binding), but others seem to have no biological role and are used as building blocks.

Greek key motif
This motif occurs in proteins with 4 adjacent antiparallel β-strands. Since the topology diagram resembles an ornamental pattern used in ancient Greece, it was called the Greek key. This motif is structural and no specific function is associated with it.
[Figure: the eight strands in γ-crystallin are arranged in two Greek key motifs]

βαβ motif
For antiparallel β-sheets we can link the strands with small loops (quite often hairpins); for parallel β-sheets, however, we need longer loops or cross-over segments. These segments are frequently made by α-helices. The whole unit then looks like β-strand - loop - α-helix - loop - β-strand. This is called the βαβ motif. The loops in this motif can differ in length (from only a few residues to nearly 100) and can contain more secondary structure elements. The motif can have two hands (helix under the strands or above), but the latter is much more common.

Adding βαβ motifs together
There are two ways to join the units together, giving:
open twisted α/β structures
α/β barrels

Three main types of structure based on βαβ motifs
Closed barrel: triosephosphate isomerase
Open twisted β-sheet: alcohol dehydrogenase
Open barrel: ribonuclease inhibitor

The active site in all α/β barrels is in a pocket formed by the loop regions that connect the carboxy ends of the β-strands with the adjacent α-helices.
[Figure: a view from the top of the barrel of the active site of the enzyme RuBisCo (ribulose bisphosphate carboxylase)]

Motifs are used as building blocks
Motifs and secondary structure elements are used as a kind of Lego blocks to form 3-dimensional structures. If the resulting structure can fold independently it is called a domain.

Large polypeptide chains fold into several domains
Large polypeptide chains often fold into several domains. Often these domains are also units of function, e.g. a DNA-binding domain, a catalytic domain, an interaction domain etc. Certain domain folds are used in many different proteins.
Lac repressor: many motifs, e.g. helix-loop-helix and 4-helix bundle
Fatty acid binding protein: β-barrel + helix-loop-helix

Classes of structures
In general all protein structures can be placed into three groups:
all α-helical proteins
all β-sheet proteins
α/β proteins

α-domain structures
Many different types of structures can be formed by α-helices alone. The first protein structures solved (myoglobin and hemoglobin) had only α-helices. Their fold is called the globin fold.
[Figure: hemoglobin]

α-domain structures
The helices in an all-helical domain can be packed in an almost parallel manner. This gives rise to two different types of packing: 4-helix bundles or large arrangements.

All-β structures
All-β structures are predominantly antiparallel (no helices to make crossovers) and consist of packed sheets.
Up-and-down barrel: retinol binding protein
Up-and-down sheet, propeller-like fold: influenza neuraminidase
Superoxide dismutase (SOD) comprises eight antiparallel β-strands.

All-β structures (2)
Jelly-roll barrel (2 x Greek key): viral coat proteins
β-helix: pectate lyase

α/β and α+β structures
These are the most common structures found. They consist of a central sheet (mixed or parallel) surrounded by α-helices (α/β) or of segregated α and β regions (α+β). There are many variations in these classes (e.g. see how βαβ units pack). Often the secondary structure elements provide structural strength while loops are involved in the function of the protein.
α/β: tyrosyl-tRNA transferase
α+β: lysozyme

Protein structure universe
[Figure: protein structure universe]

Membrane proteins
Membrane proteins account for up to two thirds of known druggable targets. Receptors (G-protein-coupled receptors, GPCRs) and ion channels are especially important targets. Structural information is still limited but growing.

General topology
Four different ways in which protein molecules may be bound to a membrane:
Membrane anchor by one transmembrane helix
α-helical integral membrane protein
β-sheet integral membrane protein
Membrane-anchored protein by amphiphilic helices

Hydropathy plots can be used for predicting transmembrane helices
[Figure: plots for the polypeptide chains L and M of the reaction center]

First high-resolution structural information for a membrane protein
The photosynthetic reaction center of a purple bacterium; Nobel Prize in 1988.

GPCR: rhodopsin
7 membrane-spanning helices. First structural information by EM; high-resolution structures by X-ray diffraction.
[Figure: low-resolution models and high-resolution model, viewed perpendicular to the plane of the membrane]

Ion channel: the potassium channel
The way the selectivity filter is formed: main-chain atoms line the walls of this narrow passage with carbonyl oxygen atoms pointing into the pore, forming binding sites for K+ ions.
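A hydropathy plot of the kind mentioned above is a sliding-window average of per-residue hydrophobicity along the sequence. The slides do not say which scale was used; the sketch below assumes the Kyte-Doolittle scale, and the window of 19 residues and cutoff of 1.6 are commonly quoted choices for spotting transmembrane helices, treated here as assumptions. Function names are invented for the example.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the slides).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[res] for res in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start indices (0-based) of windows whose mean hydropathy exceeds the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score > threshold]


if __name__ == "__main__":
    # Made-up sequence: a hydrophobic stretch flanked by charged residues.
    toy = "D" * 10 + "LIVALLGAVILLAVFLIAL" + "K" * 10
    print(candidate_tm_windows(toy))
```

Plotting the profile against the window position gives the familiar picture in which transmembrane helices appear as broad peaks.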

Example of an all-β integral membrane protein: porin
16 β-strands form an antiparallel β-barrel that traverses the membrane. The central channel allows passage of molecules across the membrane.
[Figure: molecular structure of a porin]

Example of a protein membrane-anchored by amphipathic helices: cyclo-oxygenase
Important enzyme in the prostaglandin pathway. Aspirin targets this enzyme.

If you would like to know more about protein structure:
Anders Liljas et al.: Textbook of Structural Biology, ISBN-10: 9812772081
Greg Petsko & Dagmar Ringe: Protein Structure and Function
Carl Brändén & John Tooze: Introduction to Protein Structure
Philip Bourne & Helge Weissig: Structural Bioinformatics, in particular chapters 22 & 23
http://kinemage.biochem.duke.edu/teaching/anatax/index.html
At LU: Department of Biochemistry and Structural Biology.