RNA and Protein Structure Prediction

1 RNA and Protein Structure Prediction Bioinformatics: Issues and Algorithms CSE Spring 2007 Lecture 18

2 Outline: Multi-Dimensional Nature of Life; RNA Secondary Structure Prediction; Protein Structure Determination; Protein Threading

3 Life is not one-dimensional Secondary and tertiary structures for tRNA (transfer RNA) Sample protein structures

4 RNA secondary structure: Base-pairing rules similar to basic sequence alignment. Unlike DNA, single-stranded RNA can fold back on itself. Recall that C-G and A-U form stable pairs ("Watson-Crick"). In addition, G-U forms a weaker pair ("wobble"). [Figure: RNA secondary structure]

5 Modeling RNA secondary structure Let $R = r_1 r_2 \ldots r_n$ be an RNA sequence, where $r_i \in \{A, C, G, U\}$. A secondary structure is a set of ordered pairs (i, j) such that: (1) $j - i > 4$ (pairing can't be too tight); (2) if (i, j) and (i', j') are two base pairs, with $i \le i'$, then either: (a) i = i' and j = j' (i.e., same base pair), or (b) i < j < i' < j' ((i, j) precedes (i', j')), or (c) i < i' < j' < j ((i, j) encloses (i', j')). This is also known as a folding.
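The definition of a folding can be checked mechanically. The following is a minimal sketch, not part of the lecture: the function name `is_valid_folding` and the `ALLOWED` table are assumptions for illustration, and the base-compatibility test (Watson-Crick plus wobble) is taken from the pairing rules on the earlier slide rather than from the definition itself. Positions are 1-based, as on the slide.

```python
from itertools import combinations

# Watson-Crick pairs plus the G-U wobble pair, stored as unordered sets.
ALLOWED = {frozenset("CG"), frozenset("AU"), frozenset("GU")}

def is_valid_folding(rna, pairs):
    """Return True if `pairs` (1-based (i, j) tuples) is a folding of `rna`."""
    for i, j in pairs:
        if not (1 <= i < j <= len(rna)):
            return False
        if j - i <= 4:  # rule (1): pairing can't be too tight
            return False
        if frozenset((rna[i - 1], rna[j - 1])) not in ALLOWED:
            return False
    # Rule (2): after sorting, i <= k for every pair of distinct base pairs,
    # so the two pairs must either follow one another or be nested.
    for (i, j), (k, l) in combinations(sorted(set(pairs)), 2):
        precedes = i < j < k < l
        encloses = i < k < l < j
        if not (precedes or encloses):
            return False
    return True
```

For example, on the hypothetical strand `GGGAAAACCC` the nested pairs (1, 10), (2, 9), (3, 8) pass, while the crossing pairs (1, 8) and (3, 10) are rejected because they form a knot.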

6 Visualizing RNA secondary structure [Figure: circular representation of the computer-predicted folding of Bacillus subtilis RNase P RNA, with loops labeled M = multi-loop, I = interior loop, B = bulge loop, H = hairpin loop] [Figure: dot plot representation; place a dot for each (i, j) pair. A lower base pair precedes an upper pair (i < j < i' < j'); the outermost base pair encloses all others (i < i' < j' < j).]

7 Predicting RNA secondary structure Premise: Base pairings have a stabilizing effect on a structure's free energy. Loops have a destabilizing effect. Goal: to determine a secondary structure with minimum free energy. [Figure: hairpin loop and stacked base pairs] Two popular approaches: Ignore loops, maximize number of base pairings (a bit simplistic, but leads to a nice algorithm). Employ loop-specific energy models (more realistic, but also more complex algorithms).

8 An approach for predicting RNA secondary structure Assumption: Energy of each base pair is independent of all other base pairs and of loop structure. Consequence: Total free energy is sum of energies for all base pairs. Downside: Predictions only approximate reality. Key observation: We can use solutions for shorter subsequences to determine solution for longer sequences. Should sound familiar: precisely the kind of decoupling required for dynamic programming to work.

9 Notation and definitions S(i, j) = secondary structure of RNA strand (i.e., set of base pairings) from base $r_i$ to base $r_j$, inclusive. e(a, b) = free energy of base pair (a, b). E(S(i, j)) = free energy of secondary structure S(i, j). So we have: $E(S(i,j)) = \sum_{(a,b) \in S(i,j)} e(a,b)$ Note: we can define e(a, b) to be a very large value when physical constraints are violated (e.g., $b - a \le 4$, so that hairpin loop would be too tight).
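These definitions translate almost directly into code. The sketch below is illustrative only: the numeric values in `PAIR_ENERGY` are invented (the lecture gives no energies), and `e` takes the strand as an extra argument so that it can look up the bases at positions a and b. Infinity plays the role of the "very large value" for forbidden pairs.

```python
INF = float("inf")

# Hypothetical per-pair energies (more negative = more stable); not from the lecture.
PAIR_ENERGY = {("C", "G"): -3.0, ("A", "U"): -2.0, ("G", "U"): -1.0}

def e(rna, a, b):
    """Free energy of base pair (a, b), 1-based; INF if the pair is not allowed."""
    if b - a <= 4:  # hairpin loop would be too tight
        return INF
    key = tuple(sorted((rna[a - 1], rna[b - 1])))
    return PAIR_ENERGY.get(key, INF)

def total_energy(rna, structure):
    """E(S): sum of e(a, b) over all base pairs (a, b) in the structure."""
    return sum(e(rna, a, b) for a, b in structure)
```

With these made-up energies, `total_energy("GGGAAAACCC", [(1, 10), (2, 9)])` adds two G-C pairs, and any structure containing a forbidden pair evaluates to infinity.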

10 Algorithm for predicting RNA secondary structure Consider RNA strand from base position i to base position j: What is optimal folding for this? Two possible cases depending on whether $r_j$ base-pairs: If $r_j$ does not base-pair, then E(S(i, j)) = E(S(i, j-1)), which we already computed. If $r_j$ does pair, with some $r_k$, then E(S(i, j)) = E(S(i, k-1)) + e(k, j) + E(S(k+1, j-1)), where E(S(i, k-1)) and E(S(k+1, j-1)) are both already computed. [Figure: strand $r_i \ldots r_k \ldots r_{j-1}\, r_j$ drawn for each of the two cases]

11 Algorithm for predicting RNA secondary structure Optimize free energy over increasingly longer subsequences. For each partial folding S(i, j), free energy is given by: $E(S(i,j)) = \min\{\, E(S(i+1,j-1)) + e(i,j),\ \min_{i<k\le j}\{E(S(i,k-1)) + e(k,j) + E(S(k+1,j-1))\} \,\}$ Free energy of optimal folding for entire strand is E(S(1, n)). How do we deal with the case where $r_j$ is not paired? Define e(k, j) to be 0 when k = j.
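A minimal sketch of this dynamic program follows, under stated assumptions: the energies in `PAIR_ENERGY` are invented for illustration, the function name `fold_energy` is hypothetical, and the energy of an empty strand (i > j) is taken to be 0. Letting k run from i to j folds both branches of the recurrence into one minimum: k = i is the case where r_i pairs with r_j, and k = j (with e(j, j) = 0) is the case where r_j stays unpaired.

```python
INF = float("inf")
PAIR_ENERGY = {("C", "G"): -3.0, ("A", "U"): -2.0, ("G", "U"): -1.0}  # hypothetical values

def fold_energy(rna):
    """Minimum free energy E(S(1, n)) under the independent base-pair model."""
    n = len(rna)

    def e(k, j):
        if k == j:
            return 0.0  # convention: r_j is left unpaired
        if j - k <= 4:
            return INF  # hairpin loop would be too tight
        return PAIR_ENERGY.get(tuple(sorted((rna[k - 1], rna[j - 1]))), INF)

    # E[i][j] for 1 <= i <= j <= n; entries with i > j (empty strand) stay 0.
    E = [[0.0] * (n + 2) for _ in range(n + 2)]
    for length in range(1, n + 1):          # increasingly longer subsequences
        for i in range(1, n - length + 2):
            j = i + length - 1
            E[i][j] = min(E[i][k - 1] + e(k, j) + E[k + 1][j - 1]
                          for k in range(i, j + 1))
    return E[1][n] if n else 0.0
```

The table stores only energies; recovering the pairs themselves would need a traceback over the minimizing k at each entry, which is omitted here to keep the sketch short.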

12 Time and space complexity What are time and space requirements of this algorithm? For each partial folding S(i, j), free energy is given by: $E(S(i,j)) = \min\{\, E(S(i+1,j-1)) + e(i,j),\ \min_{i<k\le j}\{E(S(i,k-1)) + e(k,j) + E(S(k+1,j-1))\} \,\}$ For RNA sequence of length n, we must fill an n x n matrix. Each entry requires exploring an average of n/2 values for k. Hence, time complexity is $O(n^3)$, space complexity is $O(n^2)$.
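As a rough check on the cubic bound, one can count the (i, j, k) combinations directly. The derivation below is a sketch that assumes k ranges over every position from i to j for each entry with i ≤ j, and writes ℓ = j − i + 1 for the subsequence length (there are n − ℓ + 1 entries of each length ℓ):

```latex
\[
\sum_{1 \le i \le j \le n} (j - i + 1)
  \;=\; \sum_{\ell = 1}^{n} (n - \ell + 1)\,\ell
  \;=\; \frac{n(n+1)(n+2)}{6}
  \;=\; \Theta(n^3)
\]
```

The constant differs from the back-of-the-envelope estimate on the slide, but the growth rate is the same.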

13 Summary of RNA secondary structure prediction #1 General summary of the situation: Tertiary structure is difficult to model and compute. Determining secondary structure is more amenable to a solution. While only an approximation, it gives good hints. No knots → planar graph. Solved using dynamic programming in $O(n^3)$ time.

14 Summary of RNA secondary structure prediction #2 Better results can be obtained by modeling loops. This problem is also solvable in $O(n^3)$ time using some tricks. [Figure: structure with bulge, hairpin loop, helical region, and interior loop labeled; red = destabilizing ("bad"), green = stabilizing ("good")]

15 Protein structure Primary structure of protein is determined by number and order of amino acids within polypeptide chain. Protein's secondary structure is defined as local conformation of its backbone, which consists of molecules that make up an amino acid's frame excluding side chains. Two common motifs include beta-pleated sheets and alpha helices. Tertiary structure is formed when attractions of side chains and secondary structure combine to form distinct 3-dimensional structure. This gives protein its specific function. Sometimes distinct proteins must combine to form correct 3-dimensional structure for a particular protein to function properly. E.g., hemoglobin is made of four similar proteins that combine to form its quaternary structure

16 Protein structure

17 Sequence → structure → function [Figure: diagram linking sequence, structure, function, and medicine]

18 How is true 3-D structure determined? As of today, must be determined experimentally. Techniques include X-ray crystallography and NMR. [Figure: diffraction pattern]

19 X-ray crystallography A diffraction pattern: the white spots are the reflections

20 From electron density map to structure

21 Protein structure determination [Figure: electron density map, backbone, and backbone with side chains]

22 Two common protein secondary structures Alpha Helix: R groups of amino acids all extend to outside. Helix makes a complete turn every 3.6 amino acids. Helix is right-handed; it twists in clockwise direction. Carbonyl group (-C=O) of each peptide bond extends parallel to axis of helix and points directly at -N-H group of peptide bond 4 amino acids below it in helix. A hydrogen bond forms between them [-N-H···O=C-]. Beta Conformation: Consists of pairs of chains lying side-by-side and stabilized by hydrogen bonds between carbonyl oxygen atom on one chain and -NH group on adjacent chain. Chains are often "anti-parallel"; N-terminal to C-terminal direction of one being reverse of other.

23 Alpha helix Alpha-helix (also written α-helix) is rod-like structure stabilized by hydrogen bonds between CO and NH groups of main chain. Ribbon representation of right-handed alpha-helix with only the alpha carbons represented. Examining backbone structure, note that alpha carbons spaced three and four apart in linear sequence are actually quite close together in helix structure. The hydrogen bonds are shown in green; all main chain CO and NH groups are hydrogen bonded. This structure is quite sturdy.

24 Beta sheet

25 Protein structure Some proteins are made up of mostly alpha helices. Both marine bloodworm hemoglobin (left) and E. coli cytochrome B562 (right) are composed of mostly alpha helices. The 4 helix bundle of the cytochrome is a common motif. Some are mostly beta sheet. The green alga plastocyanin (left) and sea snake neurotoxin (right) are mostly beta sheets. Red = alpha helix, Green = beta sheet, Black = misc. loops

26 Protein structure Many proteins are a mix of alpha helices and beta sheets. Two simple proteins with a mix of 2° components: ribonuclease T1 (left) and pancreatic trypsin inhibitor (right)

27 Protein Domains Tertiary structure of many proteins is built from several domains. Often each has a separate function to perform, such as: binding a small ligand (e.g., a peptide in the molecule shown here); spanning the plasma membrane (transmembrane proteins); containing the catalytic site (enzymes); DNA-binding (in transcription factors); providing a surface to bind specifically to another protein. In some cases, each domain is encoded by a separate exon in the gene. The histocompatibility molecule shown here has three domains, α1, α2, and α3, each encoded by its own exon.

28 Moving towards protein structure prediction... Most important information seems to be contained in alpha helices and beta sheets (which form core), not in loops. Given amino acid sequence, we want to determine locations of helices, sheets, and loops, and their arrangements. How to do this? Experimental techniques are expensive and time-consuming. Exhaustive enumeration at molecular level (taking structure with smallest free energy)? Nah... As of 2002, the NIH protein structure database contained approximately 15,000 entries. Hmm... Idea: given sequence, see if it could fit a known structure. This is known as protein threading

29 PDB new vs. old folding growth [Figure: growth of old folds vs. new folds in PDB] Number of unique folds in nature is fairly small (possibly a few thousand). 90% of new structures submitted to PDB in the past three years have similar structural folds in PDB.

30 Protein threading Somewhat similar to sequence alignment we studied earlier: homology modeling: align sequence to sequence, threading: align sequence to structure (templates)

31 The protein threading problem Given: new protein sequence, and library of templates: Find: best alignment of sequence to some template

32 The protein threading problem One possible threading (note non-local interactions):

33 The protein threading problem Input: 1. Protein sequence A with n amino acids $a_i$. 2. Core structural model C, with m core segments $C_i$. Also: (a) Length $c_i$ of each core segment. (b) Core segments $C_i$ and $C_{i+1}$ are connected by loop $\lambda_i$ for which we know max ($l^{max}_i$) and min ($l^{min}_i$) lengths. (c) Structural environment for each amino acid position. 3. Scoring function f(T) to evaluate each threading T. Output: set of integers $T = \{t_1, t_2, \ldots, t_m\}$ such that value of $t_i$ indicates which amino acid from A occupies first position in core segment i.
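The constraints implied by this input can be written as a small feasibility check. This is a sketch under assumptions: the function name `is_valid_threading` and its argument layout are invented here, lists are 0-based while the values of t are 1-based sequence positions, and the tails before the first core segment and after the last one are assumed to have no length limits.

```python
def is_valid_threading(n, core_lengths, loop_min, loop_max, t):
    """Check that the start positions t respect sequence length and loop bounds.

    core_lengths[i] is c_i; loop_min[i] and loop_max[i] bound the loop
    between core segments i and i + 1 (0-based lists, 1-based positions in t).
    """
    m = len(core_lengths)
    if len(t) != m or len(loop_min) != m - 1 or len(loop_max) != m - 1:
        return False
    # First segment must start inside the sequence; last must end inside it.
    if t[0] < 1 or t[-1] + core_lengths[-1] - 1 > n:
        return False
    for i in range(m - 1):
        # Residues strictly between the end of segment i and the start of i + 1.
        loop_len = t[i + 1] - (t[i] + core_lengths[i])
        if not (loop_min[i] <= loop_len <= loop_max[i]):
            return False
    return True
```

A check like this only says whether a threading is admissible; ranking admissible threadings is the job of the scoring function f(T).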

34 Core templates with interactions Small circles represent amino acid positions. Thin lines indicate interactions represented in model

35 The protein threading problem Possible threadings: Unfortunately, due to variable-length gaps between core segments and non-local interactions, this problem is NP-hard. Fortunately, it is amenable to solution by a general-purpose optimization strategy known as branch-and-bound

36 Digression: branch-and-bound Basic idea: Partition solution space into distinct sets. Compute lower bound that applies to all solutions in given set. If we can find a solution that is better than this lower bound, we don't need to explore any solution in that set. Important note: branch and bound will find optimal solution (it's not heuristic)... but... it might take exponential time to do it. Still, it is often much faster than naive exhaustive search.

37 Digression: branch-and-bound Let's see how this works for the traveling salesman problem, which we know is also NP-complete (protein threading is a bit too complicated for now). Given: a set of cities and costs to travel between them. Find: a minimum cost tour that visits each city once. [Figure: graph on cities A, B, C, D, E with edge costs, and the corresponding cost table]

38 Branch-and-bound Start search at A: Now let's try going from A to C: [Figure: search tree rooted at A showing a first complete tour with its total cost, followed by the subtrees under A-C with their partial costs] No point in exploring any of these subtrees any further!

39 Branch-and-bound In traveling salesman, we were able to eliminate from consideration all tours starting with: A-C-B-... and A-C-D-... and A-C-E-... because we knew they could never be optimal; we already had a tour with total cost less than their partial costs. General observation: given a complete solution with cost w, a set of partial solutions with lower bound x > w: don't bother exploring this; a set with lower bound y < w: explore this if/when its bound becomes best; a set with lower bound z < y: explore this first.
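The pruning rule can be sketched for the traveling salesman problem as follows. This is an illustration rather than the lecture's exact procedure: the function name `tsp_branch_and_bound` is hypothetical, the lower bound is simply the cost of the partial tour (valid only when all costs are non-negative), and branches are tried nearest-city-first so that a good complete tour is found early. A branch is abandoned as soon as its partial cost is no better than the best complete tour found so far.

```python
import math

def tsp_branch_and_bound(cost):
    """Return (best_cost, best_tour) for a cost matrix, starting and ending at city 0."""
    n = len(cost)
    best = [math.inf, None]  # best complete tour found so far

    def extend(tour, partial_cost, unvisited):
        # With non-negative costs, the partial cost is a lower bound on every
        # tour that begins with this prefix.
        if partial_cost >= best[0]:
            return  # prune: nothing below this node can beat the incumbent
        if not unvisited:
            total = partial_cost + cost[tour[-1]][tour[0]]  # close the tour
            if total < best[0]:
                best[0], best[1] = total, tour + [tour[0]]
            return
        # Try the cheapest next city first.
        for city in sorted(unvisited, key=lambda c: cost[tour[-1]][c]):
            extend(tour + [city],
                   partial_cost + cost[tour[-1]][city],
                   unvisited - {city})

    extend([0], 0, frozenset(range(1, n)))
    return best[0], best[1]
```

On a made-up four-city matrix such as `[[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]`, the first tour found already has the optimal cost, and several later prefixes are cut off before they are completed. Tighter lower bounds prune more, at the price of more work per node.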

40 Back to protein threading... Recall that $t_i$ indicates which amino acid occupies the first position in core segment i. Our scoring function is: $f(T) = \sum_i g_1(i, t_i) + \sum_i \sum_{j>i} g_2(i, j, t_i, t_j)$ As we know sizes of core segments and min and max lengths for loops, we can determine ranges for the $t_i$'s: $b_i \le t_i \le d_i$. This will form the basis for branch-and-bound.
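Both pieces of this slide, the score and the ranges for the start positions, can be sketched in a few lines. Everything named here is an assumption for illustration: `score` takes the singleton and pairwise terms g1 and g2 as plain callables, and `start_ranges` derives each range by packing the other segments as far left or right as the minimum loop lengths allow. Lists are 0-based; positions are 1-based. The maximum loop lengths restrict which combinations of starts are admissible, but with unconstrained terminal tails they are not needed for these individual ranges.

```python
def score(t, g1, g2):
    """f(T): singleton terms plus pairwise terms over all i < j."""
    m = len(t)
    singles = sum(g1(i, t[i]) for i in range(m))
    pairs = sum(g2(i, j, t[i], t[j]) for i in range(m) for j in range(i + 1, m))
    return singles + pairs

def start_ranges(n, core_lengths, loop_min):
    """Return [(b_i, d_i)]: earliest and latest start for each core segment."""
    m = len(core_lengths)
    b = [1] * m
    for i in range(1, m):  # pack everything to the left
        b[i] = b[i - 1] + core_lengths[i - 1] + loop_min[i - 1]
    d = [n - core_lengths[-1] + 1] * m
    for i in range(m - 2, -1, -1):  # pack everything to the right
        d[i] = d[i + 1] - loop_min[i] - core_lengths[i]
    return list(zip(b, d))
```

For a hypothetical 20-residue sequence with core lengths 3, 4, 3 and minimum loops 1 and 2, every range has the same width, which equals the slack left after placing all segments and minimum loops.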

41 Branch-and-bound for protein threading Given a set of threadings $T^*$, the optimization problem is: $\min_{T \in T^*} f(T) = \min_{T \in T^*} \left[ \sum_i g_1(i, t_i) + \sum_i \sum_{j>i} g_2(i, j, t_i, t_j) \right] = \min_{T \in T^*} \sum_i \left[ g_1(i, t_i) + \sum_{j>i} g_2(i, j, t_i, t_j) \right]$ What's a lower bound we can use? $\sum_i \left[ \min_{b_i \le x \le d_i} g_1(i, x) + \sum_{j>i} \min_{b_i \le y \le d_i,\ b_j \le z \le d_j} g_2(i, j, y, z) \right]$ Note this is determined by the interval $[b_i, d_i]$ that $t_i$ may fall in.
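The lower bound minimizes each term independently over its interval, so it can never exceed the score of any threading inside the set. A minimal sketch, assuming the hypothetical callables `g1` and `g2` and a list of `(b_i, d_i)` interval tuples (0-based segment indices, integer positions):

```python
def lower_bound(intervals, g1, g2):
    """Sum of independent per-term minima over the intervals [b_i, d_i]."""
    m = len(intervals)
    total = 0
    for i in range(m):
        bi, di = intervals[i]
        # Best singleton score for segment i anywhere in its interval.
        total += min(g1(i, x) for x in range(bi, di + 1))
        for j in range(i + 1, m):
            bj, dj = intervals[j]
            # Best pairwise score, chosen independently of all other terms.
            total += min(g2(i, j, y, z)
                         for y in range(bi, di + 1)
                         for z in range(bj, dj + 1))
    return total
```

Because each minimum may be attained at a different position, the bound is generally not reached by any single threading; it becomes exact once every interval has shrunk to a single position.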

42 Branch-and-bound for protein threading Now we must split solution space into disjoint sets. Do this by selecting largest current interval for a ti and cutting it in half
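The splitting step can be sketched as below. The function name `split_largest` and the tuple representation of intervals are assumptions for illustration; ties between equally wide intervals are broken in favor of the first one, a choice the slide does not specify.

```python
def split_largest(intervals):
    """Split the widest interval in half; return (left, right) or None if all are fixed."""
    if not intervals:
        return None
    widths = [d - b for b, d in intervals]
    i = max(range(len(intervals)), key=widths.__getitem__)
    b, d = intervals[i]
    if b == d:
        return None  # every t_i is already pinned to one position
    mid = (b + d) // 2
    # The two halves are disjoint and together cover the original interval.
    left = intervals[:i] + [(b, mid)] + intervals[i + 1:]
    right = intervals[:i] + [(mid + 1, d)] + intervals[i + 1:]
    return left, right
```

Each of the two resulting sets gets its own lower bound, and the search continues with whichever set currently has the best bound.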

43 Branch-and-bound for protein threading

44 CAFASP3 Example CAFASP: Critical Assessment of Fully Automated Structure Prediction. CAFASP3 was evaluated by MaxSub, a computer program. Predicted structures are superimposed on the experimental structures to see how much of each is superimposable. Red: Experimental Structure, Blue: Correct Prediction, Green: Incorrect Prediction

45 Wrap-up Readings for next time: "The Invention of the Genetic Code," Brian Hayes, American Scientist, vol. 86, no. 1, Jan. Feb., 1998, pp "Ode to the Code," Brian Hayes, American Scientist, vol. 92, no. 6, Nov. Dec., 2004, pp (Both papers are in the Readings folder on Blackboard.) Remember: Come to class having done the readings. Check Blackboard regularly for updates

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its

More information

Bio nformatics. Lecture 23. Saad Mneimneh

Bio nformatics. Lecture 23. Saad Mneimneh Bio nformatics Lecture 23 Protein folding The goal is to determine the three-dimensional structure of a protein based on its amino acid sequence Assumption: amino acid sequence completely and uniquely

More information

Protein folding. α-helix. Lecture 21. An α-helix is a simple helix having on average 10 residues (3 turns of the helix)

Protein folding. α-helix. Lecture 21. An α-helix is a simple helix having on average 10 residues (3 turns of the helix) Computat onal Biology Lecture 21 Protein folding The goal is to determine the three-dimensional structure of a protein based on its amino acid sequence Assumption: amino acid sequence completely and uniquely

More information

CAP 5510 Lecture 3 Protein Structures

CAP 5510 Lecture 3 Protein Structures CAP 5510 Lecture 3 Protein Structures Su-Shing Chen Bioinformatics CISE 8/19/2005 Su-Shing Chen, CISE 1 Protein Conformation 8/19/2005 Su-Shing Chen, CISE 2 Protein Conformational Structures Hydrophobicity

More information

Protein Structure Basics

Protein Structure Basics Protein Structure Basics Presented by Alison Fraser, Christine Lee, Pradhuman Jhala, Corban Rivera Importance of Proteins Muscle structure depends on protein-protein interactions Transport across membranes

More information

Basics of protein structure

Basics of protein structure Today: 1. Projects a. Requirements: i. Critical review of one paper ii. At least one computational result b. Noon, Dec. 3 rd written report and oral presentation are due; submit via email to bphys101@fas.harvard.edu

More information

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University General informations (1) Office hours: by appointment Office: TR3018

More information

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Introduction to Comparative Protein Modeling. Chapter 4 Part I Introduction to Comparative Protein Modeling Chapter 4 Part I 1 Information on Proteins Each modeling study depends on the quality of the known experimental data. Basis of the model Search in the literature

More information

Model Mélange. Physical Models of Peptides and Proteins

Model Mélange. Physical Models of Peptides and Proteins Model Mélange Physical Models of Peptides and Proteins In the Model Mélange activity, you will visit four different stations each featuring a variety of different physical models of peptides or proteins.

More information

Biomolecules: lecture 10

Biomolecules: lecture 10 Biomolecules: lecture 10 - understanding in detail how protein 3D structures form - realize that protein molecules are not static wire models but instead dynamic, where in principle every atom moves (yet

More information

Introduction to" Protein Structure

Introduction to Protein Structure Introduction to" Protein Structure Function, evolution & experimental methods Thomas Blicher, Center for Biological Sequence Analysis Learning Objectives Outline the basic levels of protein structure.

More information

98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006

98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006 98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006 8.3.1 Simple energy minimization Maximizing the number of base pairs as described above does not lead to good structure predictions.

More information

Combinatorial approaches to RNA folding Part I: Basics

Combinatorial approaches to RNA folding Part I: Basics Combinatorial approaches to RNA folding Part I: Basics Matthew Macauley Department of Mathematical Sciences Clemson University http://www.math.clemson.edu/~macaule/ Math 4500, Spring 2015 M. Macauley (Clemson)

More information

Protein Structure. Hierarchy of Protein Structure. Tertiary structure. independently stable structural unit. includes disulfide bonds

Protein Structure. Hierarchy of Protein Structure. Tertiary structure. independently stable structural unit. includes disulfide bonds Protein Structure Hierarchy of Protein Structure 2 3 Structural element Primary structure Secondary structure Super-secondary structure Domain Tertiary structure Quaternary structure Description amino

More information

From Amino Acids to Proteins - in 4 Easy Steps

From Amino Acids to Proteins - in 4 Easy Steps From Amino Acids to Proteins - in 4 Easy Steps Although protein structure appears to be overwhelmingly complex, you can provide your students with a basic understanding of how proteins fold by focusing

More information

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE Examples of Protein Modeling Protein Modeling Visualization Examination of an experimental structure to gain insight about a research question Dynamics To examine the dynamics of protein structures To

More information

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES

Protein Structure. W. M. Grogan, Ph.D. OBJECTIVES Protein Structure W. M. Grogan, Ph.D. OBJECTIVES 1. Describe the structure and characteristic properties of typical proteins. 2. List and describe the four levels of structure found in proteins. 3. Relate

More information

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB Homology Modeling (Comparative Structure Modeling) Aims of Structural Genomics High-throughput 3D structure determination and analysis To determine or predict the 3D structures of all the proteins encoded

More information

What is the central dogma of biology?

What is the central dogma of biology? Bellringer What is the central dogma of biology? A. RNA DNA Protein B. DNA Protein Gene C. DNA Gene RNA D. DNA RNA Protein Review of DNA processes Replication (7.1) Transcription(7.2) Translation(7.3)

More information

D Dobbs ISU - BCB 444/544X 1

D Dobbs ISU - BCB 444/544X 1 11/7/05 Protein Structure: Classification, Databases, Visualization Announcements BCB 544 Projects - Important Dates: Nov 2 Wed noon - Project proposals due to David/Drena Nov 4 Fri PM - Approvals/responses

More information

Bi 8 Midterm Review. TAs: Sarah Cohen, Doo Young Lee, Erin Isaza, and Courtney Chen

Bi 8 Midterm Review. TAs: Sarah Cohen, Doo Young Lee, Erin Isaza, and Courtney Chen Bi 8 Midterm Review TAs: Sarah Cohen, Doo Young Lee, Erin Isaza, and Courtney Chen The Central Dogma Biology Fundamental! Prokaryotes and Eukaryotes Nucleic Acid Components Nucleic Acid Structure DNA Base

More information

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007

Molecular Modeling. Prediction of Protein 3D Structure from Sequence. Vimalkumar Velayudhan. May 21, 2007 Molecular Modeling Prediction of Protein 3D Structure from Sequence Vimalkumar Velayudhan Jain Institute of Vocational and Advanced Studies May 21, 2007 Vimalkumar Velayudhan Molecular Modeling 1/23 Outline

More information

Orientational degeneracy in the presence of one alignment tensor.

Orientational degeneracy in the presence of one alignment tensor. Orientational degeneracy in the presence of one alignment tensor. Rotation about the x, y and z axes can be performed in the aligned mode of the program to examine the four degenerate orientations of two

More information

1/23/2012. Atoms. Atoms Atoms - Electron Shells. Chapter 2 Outline. Planetary Models of Elements Chemical Bonds

1/23/2012. Atoms. Atoms Atoms - Electron Shells. Chapter 2 Outline. Planetary Models of Elements Chemical Bonds Chapter 2 Outline Atoms Chemical Bonds Acids, Bases and the p Scale Organic Molecules Carbohydrates Lipids Proteins Nucleic Acids Are smallest units of the chemical elements Composed of protons, neutrons

More information

Dana Alsulaibi. Jaleel G.Sweis. Mamoon Ahram

Dana Alsulaibi. Jaleel G.Sweis. Mamoon Ahram 15 Dana Alsulaibi Jaleel G.Sweis Mamoon Ahram Revision of last lectures: Proteins have four levels of structures. Primary,secondary, tertiary and quaternary. Primary structure is the order of amino acids

More information

Physiochemical Properties of Residues

Physiochemical Properties of Residues Physiochemical Properties of Residues Various Sources C N Cα R Slide 1 Conformational Propensities Conformational Propensity is the frequency in which a residue adopts a given conformation (in a polypeptide)

More information

Motif Prediction in Amino Acid Interaction Networks

Motif Prediction in Amino Acid Interaction Networks Motif Prediction in Amino Acid Interaction Networks Omar GACI and Stefan BALEV Abstract In this paper we represent a protein as a graph where the vertices are amino acids and the edges are interactions

More information

Protein Structures. Sequences of amino acid residues 20 different amino acids. Quaternary. Primary. Tertiary. Secondary. 10/8/2002 Lecture 12 1

Protein Structures. Sequences of amino acid residues 20 different amino acids. Quaternary. Primary. Tertiary. Secondary. 10/8/2002 Lecture 12 1 Protein Structures Sequences of amino acid residues 20 different amino acids Primary Secondary Tertiary Quaternary 10/8/2002 Lecture 12 1 Angles φ and ψ in the polypeptide chain 10/8/2002 Lecture 12 2

More information

Conformational Geometry of Peptides and Proteins:

Conformational Geometry of Peptides and Proteins: Conformational Geometry of Peptides and Proteins: Before discussing secondary structure, it is important to appreciate the conformational plasticity of proteins. Each residue in a polypeptide has three

More information

Details of Protein Structure

Details of Protein Structure Details of Protein Structure Function, evolution & experimental methods Thomas Blicher, Center for Biological Sequence Analysis Anne Mølgaard, Kemisk Institut, Københavns Universitet Learning Objectives

More information

The Structure and Functions of Proteins

The Structure and Functions of Proteins Wright State University CORE Scholar Computer Science and Engineering Faculty Publications Computer Science and Engineering 2003 The Structure and Functions of Proteins Dan E. Krane Wright State University

More information

Advanced Certificate in Principles in Protein Structure. You will be given a start time with your exam instructions

Advanced Certificate in Principles in Protein Structure. You will be given a start time with your exam instructions BIRKBECK COLLEGE (University of London) Advanced Certificate in Principles in Protein Structure MSc Structural Molecular Biology Date: Thursday, 1st September 2011 Time: 3 hours You will be given a start

More information

Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror

Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror Protein structure (and biomolecular structure more generally) CS/CME/BioE/Biophys/BMI 279 Sept. 28 and Oct. 3, 2017 Ron Dror Please interrupt if you have questions, and especially if you re confused! Assignment

More information

Chapter

Chapter Chapter 17 17.4-17.6 Molecular Components of Translation A cell interprets a genetic message and builds a polypeptide The message is a series of codons on mrna The interpreter is called transfer (trna)

More information

Analysis and Prediction of Protein Structure (I)

Analysis and Prediction of Protein Structure (I) Analysis and Prediction of Protein Structure (I) Jianlin Cheng, PhD School of Electrical Engineering and Computer Science University of Central Florida 2006 Free for academic use. Copyright @ Jianlin Cheng

More information

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron.

Protein Dynamics. The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron. Protein Dynamics The space-filling structures of myoglobin and hemoglobin show that there are no pathways for O 2 to reach the heme iron. Below is myoglobin hydrated with 350 water molecules. Only a small

More information

RNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17

RNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17 RNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17 Dr. Stefan Simm, 01.11.2016 simm@bio.uni-frankfurt.de RNA secondary structures a. hairpin loop b. stem c. bulge loop d. interior loop e. multi

More information

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748 CAP 5510: Introduction to Bioinformatics Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs07.html 2/15/07 CAP5510 1 EM Algorithm Goal: Find θ, Z that maximize Pr

More information

1. (5) Draw a diagram of an isomeric molecule to demonstrate a structural, geometric, and an enantiomer organization.

1. (5) Draw a diagram of an isomeric molecule to demonstrate a structural, geometric, and an enantiomer organization. Organic Chemistry Assignment Score. Name Sec.. Date. Working by yourself or in a group, answer the following questions about the Organic Chemistry material. This assignment is worth 35 points with the

More information

Enzyme Catalysis & Biotechnology

Enzyme Catalysis & Biotechnology L28-1 Enzyme Catalysis & Biotechnology Bovine Pancreatic RNase A Biochemistry, Life, and all that L28-2 A brief word about biochemistry traditionally, chemical engineers used organic and inorganic chemistry

More information

MULTIPLE CHOICE. Circle the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Circle the one alternative that best completes the statement or answers the question. Summer Work Quiz - Molecules and Chemistry Name MULTIPLE CHOICE. Circle the one alternative that best completes the statement or answers the question. 1) The four most common elements in living organisms

More information

1. What is an ångstrom unit, and why is it used to describe molecular structures?

1. What is an ångstrom unit, and why is it used to describe molecular structures? 1. What is an ångstrom unit, and why is it used to describe molecular structures? The ångstrom unit is a unit of distance suitable for measuring atomic scale objects. 1 ångstrom (Å) = 1 10-10 m. The diameter

More information

Protein Threading. BMI/CS 776 Colin Dewey Spring 2015

Protein Threading. BMI/CS 776  Colin Dewey Spring 2015 Protein Threading BMI/CS 776 www.biostat.wisc.edu/bmi776/ Colin Dewey cdewey@biostat.wisc.edu Spring 2015 Goals for Lecture the key concepts to understand are the following the threading prediction task

More information

Ch 3: Chemistry of Life. Chemistry Water Macromolecules Enzymes

Ch 3: Chemistry of Life. Chemistry Water Macromolecules Enzymes Ch 3: Chemistry of Life Chemistry Water Macromolecules Enzymes Chemistry Atom = smallest unit of matter that cannot be broken down by chemical means Element = substances that have similar properties and

More information

BCH 4053 Spring 2003 Chapter 6 Lecture Notes

BCH 4053 Spring 2003 Chapter 6 Lecture Notes BCH 4053 Spring 2003 Chapter 6 Lecture Notes 1 CHAPTER 6 Proteins: Secondary, Tertiary, and Quaternary Structure 2 Levels of Protein Structure Primary (sequence) Secondary (ordered structure along peptide

More information




