A tutorial on RNA folding methods and resources

1 A tutorial on RNA folding methods and resources Alain Denise, LRI/IGM, Université Paris-Sud with invaluable help from Yann Ponty, CNRS/Ecole Polytechnique 1 Master BIBS

2 Goals
- To help you work your way through the RNA data jungle.
- To introduce mature structure prediction/annotation tools and algorithms.
Topics: locating structural data, energy minimization, Boltzmann ensemble, pseudoknots, structural annotation, comparative methods.

3-4 RNA structure(s) [figure slides]

5 How RNA folds
Canonical base pairs: G/C, U/A, U/G.
[Figure: 5S rRNA (PDB ID: 1UN6).]
RNA folding is a hierarchical stochastic process driven by, and resulting in, the pairing (through hydrogen bonds) of a subset of the bases.

6 Secondary structure representations

7 Exercise
- Download the Java applet VARNA [choose the Binaries file] and run it.
- Look at the two example structures and see how to switch from one to the other.
- Enter the following new sequence and structure:
  AAGGGCTTAGCTTAATTAAAGTAGTTGATTTGCATTCAGCAGCTGTAGGATAAAGTCTTGCAGTCCTTA
  (((((((..((((...)))).(((((...)))))...(((((...)))))))))))).
- Try the different representations (right-click, Redraw, Algorithm).

8 Sources of RNA structural data
Name | Data type | Scope | Description | File formats | #Entries
PDB | All-atom models | General | RCSB Protein Data Bank: global repository for 3D molecular models | PDB | ~1,900 models
NDB | All-atom models, secondary structures | General | Nucleic Acids Database: nucleic acid models and structural annotations | PDB, RNAML | ~2,000 models
RFAM | Alignments, secondary structures | General | RNA FAMilies: multiple alignments of RNAs grouped into functional families, with consensus secondary structures that are predicted and/or manually curated | STOCKHOLM, FASTA | ~1,973 alignments/structures, 2,756,313 sequences
STRAND | Secondary structures | General | The RNA secondary STRucture and statistical ANalysis Database: curated aggregation of several databases | CT, BPSEQ, RNAML, FASTA, Vienna | 4,666 structures
PseudoBase | Secondary structures | Pseudoknotted RNAs | PseudoBase: secondary structures of known pseudoknotted RNAs | Extended Vienna RNA | 359 structures
CRW | Sequence alignments, secondary structures | Ribosomal RNAs, introns | Comparative RNA Web Site: manually curated alignments and statistics of ribosomal RNAs | FASTA, ALN, BPSEQ | 1,109 structures, 91,877 sequences

9 Basic prediction: minimal free-energy folding

10 Minimal Free-Energy (MFE) Folding
Goal: predict the functional (aka native) conformation of an RNA.
Hypothesis: the RNA folds into a minimal free-energy configuration.
- The Turner model associates free energies with secondary structures.
- The model is additive: the global energy is the sum of the energies of the secondary structure elements (base-pair stackings, terminal loops, internal loops, bulges).
Reference: Durbin, Eddy, Krogh, Mitchison, Biological sequence analysis, Cambridge Univ. Press.

11 Minimal Free-Energy (MFE) Folding
- Several programs do this, notably RNAfold and Mfold.
- Most of them consider only secondary structures without pseudoknots.
- Almost all of them are based on a very powerful algorithmic approach: dynamic programming.
[Figure: the sequence CAGUAGCCGAUCGCAGCUAGCGUA given as input to RNAfold / Mfold.]

12 Minimal Free-Energy (MFE) Folding
So the folding problem is, given an energy function:
Data: a sequence.
Output: the folding that has the minimum free energy.
[Figure: the sequence CAGUAGCCGAUCGCAGCUAGCGUA given as input to RNAfold / Mfold.]
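Written in symbols, the problem statement and the additivity assumption could look as follows. The notation is introduced here only for illustration and is not from the slides: w is the input sequence, \mathcal{S}(w) the set of pseudoknot-free secondary structures of w, and e(\ell) the energy contribution of one structural element \ell (stack, terminal loop, internal loop, bulge).

```latex
S^{*} \;=\; \operatorname*{arg\,min}_{S \in \mathcal{S}(w)} E(S),
\qquad
E(S) \;=\; \sum_{\ell \,\in\, \mathrm{elements}(S)} e(\ell)
```

Additivity is what makes dynamic programming applicable: the best structure on an interval can be assembled from the best structures on smaller intervals.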

13 Exercise
- In Rfam, search for the microRNA mir-263 family.
- On the family page, click on "Alignments" in the left menu.
- Ask to view the Seed alignment in FASTA (ungapped) format; this is done in the second group of choices, «Formatting options».
- Take the first two sequences: T. castaneum and D. virilis.
- Fold them with RNAfold.
- With VARNA, draw and compare the two structures.

14 Exercise: some not-so-easy cases
Here are three tRNA sequences with their true structures. Fold them with RNAfold, and compare each predicted structure with the corresponding true one. Do they agree?
>Artibeus jamaicensis, True Structure, tRNA Alanine
AAGGGCTTAGCTTAATTAAAGTAGTTGATTTGCATTCAGCAGCTGTAGGATAAAGTCTTGCAGTCCTTA
(((((((..((((...)))).(((((...)))))...(((((...)))))))))))).
>Balaenoptera musculus, True Structure, tRNA Alanine
GAGGATTTAGCTTAATTAAAGTGTTTGATTTGCATTCAATTGATGTAAGATATAGTCTTGCAGTCCTTA
(((((((..((((...))))..((((...))))...(((((...)))))))))))).
>Bos taurus, True Structure, tRNA Alanine
GAGGATTTAGCTTAATTAAAGTGGTTGATTTGCATTCAATTGATGTAAGGTGTAGTCTTGCAATCCTTA
(((((((..((((...)))).(((((...)))))...(((((...)))))))))))).
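One way to make "compare the predicted structure with the true one" quantitative is to count shared base pairs. The sketch below is only an illustration: the function names `base_pairs` and `compare` are hypothetical, it assumes plain dot-bracket strings of equal length (one character per nucleotide), and the two short structures at the end are made up, not taken from the exercise.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs (0-based) encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def compare(true_db, predicted_db):
    """Return (sensitivity, ppv) of the predicted base pairs with respect to the true ones.

    sensitivity = shared pairs / true pairs
    ppv         = shared pairs / predicted pairs
    A ratio with an empty denominator is reported as 0.0 (a convention chosen here).
    """
    if len(true_db) != len(predicted_db):
        raise ValueError("structures must have the same length")
    true_pairs = base_pairs(true_db)
    pred_pairs = base_pairs(predicted_db)
    shared = true_pairs & pred_pairs
    sensitivity = len(shared) / len(true_pairs) if true_pairs else 0.0
    ppv = len(shared) / len(pred_pairs) if pred_pairs else 0.0
    return sensitivity, ppv


# Made-up toy example: the prediction recovers one of the two true pairs.
print(compare("((..))", "(....)"))  # (0.5, 1.0)
```

A sensitivity well below 1 on the three tRNAs would indicate that the minimum free-energy structure misses part of the cloverleaf.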

15 Dynamic programming explained on a simpler (and unrealistic) energy model
The first dynamic programming algorithm for RNA folding (Nussinov, Jacobson 1978) used only the number of base pairs as its energy function:
Data: a sequence.
Output: the secondary structure (without pseudoknots) that has the largest number of base pairs.
This energy model is highly unrealistic! Nevertheless, this algorithm led to the best current folding methods (which use much better energy functions), because it relies on dynamic programming.

16 Dynamic programming explained on a simpler (and unrealistic) energy model
Dynamic programming is a very general optimization method, used in many applications in computer science.
The first step is to find a recurrence relation that describes how the objects of interest are constructed.
Notation:
- γ(i,j) = number of base pairs in the best structure between bases i and j.
- δ(i,j) = 1 if bases i and j are complementary, 0 otherwise.

17 Nussinov's algorithm (1978)
Let i and j be two positions. Suppose that I know the best possible structures within every interval contained in (i,j), except for the interval (i,j) itself.
[Figure: the sequence from 1 to n, with positions i, i+1, j-1 and j marked.]
How can I construct the best possible structure between i and j?

18 Nussinov's algorithm (1978)
How can I construct the best possible structure between i and j? There are 4 ways to do it:
1. Leave i unpaired and take the best structure on (i+1, j).
2. Leave j unpaired and take the best structure on (i, j-1).
3. Pair i with j around the best structure on (i+1, j-1).
4. Split at some position k and join the best structures on (i, k) and (k+1, j).

19 Nussinov's algorithm (1978)
Each of the 4 cases gives one term of the recurrence:
$$\gamma(i,j) = \max \begin{cases} \gamma(i+1,j) & \text{(case 1)} \\ \gamma(i,j-1) & \text{(case 2)} \\ \gamma(i+1,j-1) + \delta(i,j) & \text{(case 3)} \\ \max_{i<k<j} \{\gamma(i,k) + \gamma(k+1,j)\} & \text{(case 4)} \end{cases}$$

20 Recurrence relation
$$\gamma(i,j) = \max \begin{cases} \gamma(i+1,j) \\ \gamma(i,j-1) \\ \gamma(i+1,j-1) + \delta(i,j) \\ \max_{i<k<j} \{\gamma(i,k) + \gamma(k+1,j)\} \end{cases}$$
Base cases: γ(i,i) = 0 for any i, and γ(i,j) = 0 if j < i.
With this recurrence, we can compute γ(1,n), the maximum number of base pairs in the whole sequence. We will construct the corresponding structure in a later step.
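The recurrence can be turned into code almost literally. The following is a minimal sketch, not the implementation used by any folding program: indices are 0-based (the slides use 1..n), δ is assumed to accept the canonical pairs G/C, A/U and G/U, no minimum hairpin-loop size is enforced (the recurrence on the slide has none), and the function names `delta`, `nussinov` and `traceback` are chosen here for illustration. The traceback shown is one common way to recover a structure from the filled table; when several optimal structures exist it returns just one of them.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def delta(a, b):
    """1 if nucleotides a and b can pair (assumed: canonical pairs incl. G/U), else 0."""
    return 1 if (a, b) in PAIRS else 0


def nussinov(seq):
    """Fill gamma[i][j] = max number of base pairs on seq[i..j] (0-based, inclusive)."""
    n = len(seq)
    # Entries with j <= i stay 0: these are the base cases of the recurrence.
    gamma = [[0] * n for _ in range(n)]
    for span in range(1, n):               # fill by increasing interval length
        for i in range(n - span):
            j = i + span
            best = max(
                gamma[i + 1][j],                                # case 1: i unpaired
                gamma[i][j - 1],                                # case 2: j unpaired
                gamma[i + 1][j - 1] + delta(seq[i], seq[j]),    # case 3: i pairs with j
            )
            for k in range(i + 1, j):                           # case 4: split at k
                best = max(best, gamma[i][k] + gamma[k + 1][j])
            gamma[i][j] = best
    return gamma


def traceback(seq, gamma):
    """Recover one optimal structure, as a dot-bracket string, from the filled table."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if gamma[i][j] == gamma[i + 1][j]:
            stack.append((i + 1, j))
        elif gamma[i][j] == gamma[i][j - 1]:
            stack.append((i, j - 1))
        elif delta(seq[i], seq[j]) and gamma[i][j] == gamma[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if gamma[i][j] == gamma[i][k] + gamma[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)


seq = "CCGGCAUG"
table = nussinov(seq)
print(table[0][len(seq) - 1])   # gamma(1,n) in the slides' numbering: 4
print(traceback(seq, table))
```

The table is filled by increasing interval length so that every value on the right-hand side of the recurrence is already known when γ(i,j) is computed.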

21-30 Worked example
Filling the γ(i,j) table for the sequence CCGGCAUG (C1 C2 G3 G4 C5 A6 U7 G8) with the recurrence of slide 20.
[Figures: successive states of the dynamic programming table, filled step by step.]
Exercise (slide 29): complete the table.

31 Algorithmic complexity
The complexity of an algorithm is a measure of its memory and time requirements as a function of the size of the data. In many cases it is not given precisely, but as an «order of magnitude».
Here, as a function of n, the length of the sequence:
- Space complexity: O(n^2)
- Time complexity: O(n^3)
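These two orders of magnitude can be checked by counting, assuming the table holds one entry γ(i,j) per interval i < j and that cases 1 to 3 of the recurrence cost constant time while case 4 tries every split position k with i < k < j:

```latex
\#\{(i,j) : 1 \le i < j \le n\} \;=\; \binom{n}{2} \;=\; \frac{n(n-1)}{2} \;=\; O(n^{2})
\qquad \text{(table entries)}

\sum_{1 \le i < j \le n} (j - i - 1)
\;=\; \#\{(i,k,j) : 1 \le i < k < j \le n\}
\;=\; \binom{n}{3} \;=\; \frac{n(n-1)(n-2)}{6} \;=\; O(n^{3})
\qquad \text{(split positions tried)}
```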

32 A language-theoretical point of view
$$\gamma(i,j) = \max \begin{cases} \gamma(i+1,j) & \text{(case 1)} \\ \gamma(i,j-1) & \text{(case 2)} \\ \gamma(i+1,j-1) + \delta(i,j) & \text{(case 3)} \\ \max_{i<k<j} \{\gamma(i,k) + \gamma(k+1,j)\} & \text{(case 4)} \end{cases}$$

33 A language-theoretical point of view
Each case of the decomposition corresponds to a rule of a context-free grammar:
1. i unpaired, followed by (i+1, j): S → N S
2. (i, j-1), followed by j unpaired: S → S N
3. i paired with j around (i+1, j-1): S → N S N
4. (i, k) followed by (k+1, j): S → S S
Empty interval: S → ε
Nucleotides: N → a | c | g | u

34 A language-theoretical point of view
Context-free grammar: S → NS | SN | NSN | SS | ε ; N → a | c | g | u
[Figure: a derivation tree for an example sequence.]

35 A language-theoretical point of view
Context-free grammar: S → NS | SN | NSN | SS | ε ; N → a | c | g | u
[Figure: a derivation tree for an example sequence.]
- The grammar can generate all possible sequences.
- For any given sequence, it can generate all possible secondary structures: each derivation tree represents a structure. The grammar is therefore ambiguous by design.
- The energy of a given structure can be computed with a system of attributes associated with the rules of the grammar.
- In fact, the Nussinov algorithm is a variant of CYK (see ITPP course).

36 Up-to-date algorithm: Zuker, Stiegler 1981 and its improvements
- The same general principle: dynamic programming.
- The same algorithmic complexity.
- But a much more realistic energy function, whose parameters were determined from experimental measurements (Turner, 1999, 2004).
Software: RNAfold, UNAFold/mfold

37 Probabilistic approaches in RNA folding
RNA in silico paradigm shift: from single-structure, minimal free-energy folding to ensemble approaches.
[Figure: the sequence CAGUAGCCGAUCGCAGCUAGCGUA given as input to UNAFold, RNAfold, Sfold.]
Questions: Ensemble diversity? Structure likelihood? Evolutionary robustness?

38 Probabilistic approaches indicate uncertainty and suggest alternative conformations
Example:
>ENA M10740 M Saccharomyces cerevisiae Phe-tRNA. : Location:1..76
GCGGATTTAGCTCAGTTGGGAGAGCGCCAGACTGAAGATTTGGAGGTCCTGTGTTCGATCCACAGAATTCGCACCA
[Figures: the native structure, and the base-pair probability «dot-plot» produced by RNAfold -p.]
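To make the ensemble idea concrete, here is a toy sketch that enumerates every pseudoknot-free structure of a very short sequence, weights each one by its Boltzmann factor exp(-E/RT), and derives base-pair probabilities, which is the quantity a dot-plot displays. Everything about the energy model is an assumption made for illustration: each base pair contributes a fixed energy (`energy_per_pair`, -1 kcal/mol by default) instead of the Turner parameters, RT is taken as 0.616 kcal/mol (about 37 °C), and the function names are hypothetical. RNAfold -p obtains these probabilities with McCaskill's partition function algorithm, not by explicit enumeration, which is only feasible for tiny inputs.

```python
import math

PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def enumerate_structures(seq, min_loop=3):
    """List all pseudoknot-free structures of seq, each as a frozenset of (i, j) pairs.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    The number of structures grows exponentially: use short sequences only.
    """
    def rec(i, j):
        if i >= j:                      # empty or single-base interval: only the open chain
            yield frozenset()
            return
        yield from rec(i + 1, j)        # position i is unpaired
        for k in range(i + min_loop + 1, j + 1):   # position i pairs with k
            if (seq[i], seq[k]) in PAIRS:
                for inner in rec(i + 1, k - 1):
                    for outer in rec(k + 1, j):
                        yield inner | outer | {(i, k)}

    return list(rec(0, len(seq) - 1))


def pair_probabilities(seq, energy_per_pair=-1.0, rt=0.616, min_loop=3):
    """Return ({(i, j): probability}, Z) under the toy energy E(S) = energy_per_pair * |S|."""
    z = 0.0
    weight_of_pair = {}
    for structure in enumerate_structures(seq, min_loop):
        w = math.exp(-energy_per_pair * len(structure) / rt)   # Boltzmann factor
        z += w
        for pair in structure:
            weight_of_pair[pair] = weight_of_pair.get(pair, 0.0) + w
    return {pair: w / z for pair, w in weight_of_pair.items()}, z


probs, z = pair_probabilities("GGGAAACCC")
for (i, j), p in sorted(probs.items(), key=lambda item: -item[1])[:3]:
    print(i, j, round(p, 3))
```

A pair with probability close to 1 is present in almost all low-energy structures; intermediate values point to competing alternative conformations.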

39 Taking non-canonical base pairs into account [Parisien, Major 2008]
Sequence → MC-Fold → secondary structure (with non-canonical base pairs) → MC-Sym → 3D structure

40 Taking non-canonical base pairs into account [Parisien, Major 2008]
- Energy model: a decomposition of the structure similar to Turner's.
- Parameter values are estimated from (experimentally) known structures.
- Algorithm: a heuristic based in part on dynamic programming.

41 Pseudoknots: new practical tools (at last!)

42 Pseudoknots
Pseudoknots are complex topological motifs characterized by crossing interactions.
Pseudoknots are largely ignored by computational prediction tools:
- Lack of an accepted energy model
- Algorithmically challenging
Yet heuristics can sometimes be efficient.
Visualization of secondary structures with pseudoknots is supported by:
- PseudoViewer
- VARNA
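"Crossing interactions" has a precise meaning: two base pairs (i, j) and (k, l) cross when i < k < j < l. The sketch below detects such crossings in an extended dot-bracket string, where additional bracket types such as [ ] mark the pairs that cross the ( ) pairs. It is an illustration under assumptions: the set of bracket types and the function names are chosen here, any other character (for example '.' or ':') is treated as unpaired, and the example strings are made up.

```python
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def parse_extended(dot_bracket):
    """Return the sorted list of (i, j) pairs; each bracket type is matched independently."""
    opener_of = {close: open_ for open_, close in BRACKETS.items()}
    stacks = {open_: [] for open_ in BRACKETS}
    pairs = []
    for pos, ch in enumerate(dot_bracket):
        if ch in BRACKETS:
            stacks[ch].append(pos)
        elif ch in opener_of:
            stack = stacks[opener_of[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return every couple of base pairs (i, j), (k, l) with i < k < j < l."""
    pairs = sorted(pairs)
    return [
        (p, q)
        for index, p in enumerate(pairs)
        for q in pairs[index + 1:]
        if p[0] < q[0] < p[1] < q[1]
    ]


def has_pseudoknot(dot_bracket):
    return bool(crossing_pairs(parse_extended(dot_bracket)))


print(has_pseudoknot("((..))"))       # False: nested pairs only
print(has_pseudoknot("((:[[:))]]"))   # True: the [ ] pairs cross the ( ) pairs
```

Algorithms restricted to nested structures, such as the dynamic programming scheme above, can never output a structure for which this test is true.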

43 Exercise
Here is the native structure of a tmRNA from PseudoBase (ID: PKB210):
CCGCUGCACUGAUCUGUCCUUGGGUCAGGCGGGGGAAGGCAACUUCCCAGGGGGCAACCCCGAACCGCAGCAGCGACAUUCACAAGGAAU
:((((((::(((:::[[[[[[[::))):((((((((((::::)))))):((((::::)))):::)))):)))))):::::::]]]]]]]:
- Fold this sequence using RNAfold and compare the result to the native structure.
- Fold this sequence using pknotsRG (Program type: Enforcing PK).

44 Predicting pseudoknots
[Figures: true structure; pknotsRG prediction; RNAfold prediction.]

45 Predicting pseudoknots
- In fact, the general problem of predicting secondary structures with pseudoknots under a relevant energy function is intractable in practice.
- Formally, this problem is NP-hard, that is, it is among the most difficult problems in computer science.
- However, algorithms exist for some restricted classes of pseudoknots. This is the case of pknotsRG.
- But these algorithms are computationally expensive.

46 A summary of prediction algorithms, with or without pseudoknots
Without pseudoknots:
Year | Authors | Complexity | Software
1981 | Zuker, Stiegler | O(n^4) |
1999 | Lyngsø, Zuker, Pedersen | O(n^3) | RNAfold, mfold
With pseudoknots:
Year | Authors | Complexity | Software
1999 | Rivas, Eddy | O(n^6) |
2000 | Lyngsø, Pedersen | O(n^5) |
2000 | Akutsu, Uemura | O(n^5) |
2003 | Dirks, Pierce | O(n^5) |
2004 | Reeder, Giegerich | O(n^4) | pknotsRG
2009 | Cao, Chen | O(n^6) |

47 Prediction by homology

48 Prediction by homology
Data: several homologous RNA sequences.
Output: a consensus structure for this set of sequences.

49 1. From sequence alignment

50 Detecting covariations
We start from a sequence alignment:
GAGGACTGAGCTCAGTTAAAGTGCCTG
AAGGGCCCCGCTGGGCAAAG--GCTG
AAGGGGTCGGCTGACCTAAAGTAGTTG
GAGGGGTGAG-GCAUCTAAAGTGTTTG
GAGGACTGTGCTCAGTTAAAGTGTTTG
...((((...))))...
We look for sequence covariations. They result from compensatory mutations during evolution.

51 Detecting covariations
We start from a sequence alignment:
GAGGACTGAGCTCAGTTAAAGTGCCTG
AAGGGCCCCGCTGGGCAAAG--GCTG
AAGGGGTCGGCTGACCTAAAGTAGTTG
GAGGGGTGAG-GCAUCTAAAGTGTTTG
GAGGACTGTGCTCAGTTAAAGTGTTTG
...((((...))))...
Measure: the mutual information between positions (columns) i and j,
$$M(i,j) = \sum_{a,b} \Pr(i=a,\, j=b) \, \log \frac{\Pr(i=a,\, j=b)}{\Pr(i=a)\,\Pr(j=b)}$$
where a and b range over the different nucleotides.
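The mutual information between two alignment columns can be estimated directly from observed frequencies. The sketch below uses the usual definition, the sum over nucleotide pairs (a, b) of Pr(i=a, j=b) · log2[Pr(i=a, j=b) / (Pr(i=a) · Pr(j=b))], with probabilities replaced by counts divided by the number of sequences. Assumptions made here: all rows have the same length, a gap character is counted like any other symbol, no small-sample correction is applied, and the four-sequence alignment is a made-up toy, not the one on the slide.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment (0-based)."""
    n = len(alignment)
    column_i = [row[i] for row in alignment]
    column_j = [row[j] for row in alignment]
    freq_i = Counter(column_i)
    freq_j = Counter(column_j)
    freq_ij = Counter(zip(column_i, column_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 2 always form a Watson-Crick pair, column 1 never varies.
toy = ["GAC", "AAU", "CAG", "UAA"]
print(mutual_information(toy, 0, 2))   # 2.0 bits: perfect covariation
print(mutual_information(toy, 0, 1))   # 0.0 bits: a conserved column carries no signal
```

The second value illustrates a known limitation: a perfectly conserved base pair has zero mutual information, so covariation alone cannot detect it.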

52 Two programs based on this approach
- RNAalifold (Hofacker et al. 2000)
- RNAz (Washietl et al. 2005)

53 Application: tRNA Alanine
>Artibeus_jamaicensis
AAGGGCTTAGCTTAATTAAAGTAGTTGATTTGCATTCAGCAGCTGTAGGATAAAGTCTTGCAGTCCTTA
>Balaenoptera_musculus
GAGGATTTAGCTTAATTAAAGTGTTTGATTTGCATTCAATTGATGTAAGATATAGTCTTGCAGTCCTTA
>Bos_taurus
GAGGATTTAGCTTAATTAAAGTGGTTGATTTGCATTCAATTGATGTAAGGTGTAGTCTTGCAATCCTTA
>Canis_familiaris
GAGGGCTTAGCTTAATTAAAGTGTTTGATTTGCATTCAATTGATGTAAGATAGATTCTTGCAGCCCTTA
>Ceratotherium_simum
GAGGGTTTAGCTTAATTAAAGTGTTTGATTTGCATTCAGTTGATGTAAGATAGAGTCTTGCAGCCCTTA
>Dasypus_novemcinctus
GAGGACTTAGCTTAATTAAAGTGCCTGATTTGCGTTCAGGAGATGTGGGGCTAAATCTTGCAGTCCTTA
>Equus_asinus
AAGGGCTTAGCTTAATGAAAGTGTTTGATTTGCGTTCAATTGATGTGAGATAGAGTCTTGCAGTCCTTA
>Erinaceus_europeus
GAGGATTTAGCTTAAAAAAAGTGGTTGATTTGCATTCAATTGATATAGGAAATATAATCTTGTAATCCTTA
>Felis_catus
GAGGACTTAGCTTAATTAAAGTGTTTGATTTGCAATCAATTGATGTAAGATAGATTCTTGCAGTCCTTA
>Hippopotamus_amphibius
AGGGACTTAGCTTAATAAAAGCAGTTGAGTTGCATTCAATTGATGTGAGGTGCGGTCTTGCAGTCTCTA
>Homo_sapiens
AAGGGCTTAGCTTAATTAAAGTGGCTGATTTGCGTTCAGTTGATGCAGAGTGGGGTTTTGCAGTCCTTA

54 Exercise
1. Compute an alignment of the previous sequences, using ClustalW or ClustalO (do not forget to select the «DNA» option).
2. Copy/paste the result into RNAalifold.
3. Look at the result.

55 Application: H. sapiens tRNAs
>Homo_sapiensArg
TGGTATATAGTTTAAACAAAACGAATGATTTCGACTCATTAAATTATGATAATCATATTTACCAA
>Homo_sapiensAsn
TAGATTGAAGCCAGTTGATTAGGGTGCTTAGCTGTTAACTAAGTGTTTGTGGGTTTAAGTCCCATTGGTCTAG
>Homo_sapiensAsp
AAGGTATTAGAAAAACCATTTCATAACTTTGTCAAAGTTAAATTATAGGCTAAATCCTATATATCTTA
>Homo_sapiensCys
AGCTCCGAGGTGATTTTCATATTGAATTGCAAATTCGAAGAAGCAGCTTCAAACCTGCCGGGGCTT
>Homo_sapiensGln
TAGGATGGGGTGTGATAGGTGGCACGGAGAATTTTGGATTCTCAGGGATGGGTTCGATTCTCATAGTCCTAG
>Homo_sapiensGlu
GTTCTTGTAGTTGAAATACAACGATGGTTTTTCATATCATTGGTCGTGGTTGTAGTCCGTGCGAGAATA
>Homo_sapiensGly
ACTCTTTTAGTATAAATAGTACCGTTAACTTCCAATTAACTAGTTTTGACAACATTCAAAAAAGAGTA
>Homo_sapiensHis
GTAAATATAGTTTAACCAAAACATCAGATTGTGAATCTGACAACAGAGGCTTACGACCCCTTATTTACC
>Homo_sapiensIso
AGAAATATGTCTGATAAAAGAGTTACTTTGATAGAGTAAATAATAGGAGCTTAAACCCCCTTATTTCTA
>Homo_sapiensLeuCun
ACTTTTAAAGGATAACAGCTATCCATTGGTCTTAGGCCCCAAAAATTTTGGTGCAACTCCAAATAAAAGTA

56 Exercise
The same as before, but with these new sequences.
1. Compute an alignment of the previous sequences, using ClustalW or ClustalO (do not forget to select the «DNA» option).
2. Copy/paste the result into RNAalifold.
3. Look at the result. What happened? Why?

57 Simultaneous folding and alignment

58 Problem specification
Data: a set of sequences.
Output: a sequence alignment and a common secondary structure.

59 Approaches
The reference approach: Sankoff's algorithm (1985)
- Algorithmic approach: dynamic programming
- Complexity: O(n^{3k}) for k sequences of length n
Two implementations (with constraints):
- Foldalign (Gorodkin, Heyer, Stormo 1997; Havgaard, Lyngsø, Stormo, Gorodkin 2005)
- Dynalign (Mathews, Turner 2002)
Heuristics based on this algorithm: LocARNA
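A quick back-of-the-envelope calculation shows why the full algorithm is impractical and why implementations add constraints. The numbers below ignore constant factors and take n = 100 as an arbitrary example length:

```latex
k = 1:\quad n^{3k} = 100^{3} = 10^{6}
\qquad
k = 2:\quad n^{3k} = 100^{6} = 10^{12}
\qquad
k = 3:\quad n^{3k} = 100^{9} = 10^{18}
```

For k = 1 the bound reduces to the O(n^3) of single-sequence folding; each additional sequence multiplies the work by a factor of n^3.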

60 Exercise
1. Take the two previous sets of sequences (one after the other) and run LocARNA. Look at the results.
2. Consider the first set only. Run LocARNA with the first two sequences, then the first three, and so on. How many sequences do you need to get the right tRNA structure?

61 Sankoff's algorithm in a few words
Data: a set of sequences.
Parameters: a score matrix giving a score S_{ij,kl} for each alignment of pairs of nucleotides.
Output: a sequence alignment and a common secondary structure.
Method: dynamic programming.
It is a bit complicated, so we will study a simplified version of the algorithm, Foldalign:
- Two sequences only
- No multiloop allowed in the secondary structure
- Simplified score matrix

62 Recurrence relation for Foldalign


RecitaLon CB Lecture #10 RNA Secondary Structure

RecitaLon CB Lecture #10 RNA Secondary Structure RecitaLon 3-19 CB Lecture #10 RNA Secondary Structure 1 Announcements 2 Exam 1 grades and answer key will be posted Friday a=ernoon We will try to make exams available for pickup Friday a=ernoon (probably

More information

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES ÉRIC FUSY AND PETER CLOTE Abstract. It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is 1.104366

More information

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

More information

Structure-Based Comparison of Biomolecules

Structure-Based Comparison of Biomolecules Structure-Based Comparison of Biomolecules Benedikt Christoph Wolters Seminar Bioinformatics Algorithms RWTH AACHEN 07/17/2015 Outline 1 Introduction and Motivation Protein Structure Hierarchy Protein

More information

Semi-Supervised CONTRAfold for RNA Secondary Structure Prediction: A Maximum Entropy Approach

Semi-Supervised CONTRAfold for RNA Secondary Structure Prediction: A Maximum Entropy Approach Wright State University CORE Scholar Browse all Theses and Dissertations Theses and Dissertations 2011 Semi-Supervised CONTRAfold for RNA Secondary Structure Prediction: A Maximum Entropy Approach Jianping

More information

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction CMPS 6630: Introduction to Computational Biology and Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the

More information

The wonderful world of RNA informatics

The wonderful world of RNA informatics December 9, 2012 Course Goals Familiarize you with the challenges involved in RNA informatics. Introduce commonly used tools, and provide an intuition for how they work. Give you the background and confidence

More information

DATA ACQUISITION FROM BIO-DATABASES AND BLAST. Natapol Pornputtapong 18 January 2018

DATA ACQUISITION FROM BIO-DATABASES AND BLAST. Natapol Pornputtapong 18 January 2018 DATA ACQUISITION FROM BIO-DATABASES AND BLAST Natapol Pornputtapong 18 January 2018 DATABASE Collections of data To share multi-user interface To prevent data loss To make sure to get the right things

More information

Extending the hypergraph analogy for RNA dynamic programming

Extending the hypergraph analogy for RNA dynamic programming Extending the hypergraph analogy for RN dynamic programming Yann Ponty Balaji Raman édric Saule Polytechnique/NRS/INRI MIB France Yann Ponty, Balaji Raman, édric Saule RN Folding RN = Biopolymer composed

More information

A Structure-Based Flexible Search Method for Motifs in RNA

A Structure-Based Flexible Search Method for Motifs in RNA JOURNAL OF COMPUTATIONAL BIOLOGY Volume 14, Number 7, 2007 Mary Ann Liebert, Inc. Pp. 908 926 DOI: 10.1089/cmb.2007.0061 A Structure-Based Flexible Search Method for Motifs in RNA ISANA VEKSLER-LUBLINSKY,

More information

HMM applications. Applications of HMMs. Gene finding with HMMs. Using the gene finder

HMM applications. Applications of HMMs. Gene finding with HMMs. Using the gene finder HMM applications Applications of HMMs Gene finding Pairwise alignment (pair HMMs) Characterizing protein families (profile HMMs) Predicting membrane proteins, and membrane protein topology Gene finding

More information

Traditionally, the role of RNA in the cell was considered

Traditionally, the role of RNA in the cell was considered Fast and reliable prediction of noncoding RNAs Stefan Washietl*, Ivo L. Hofacker*, and Peter F. Stadler* *Department of Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse

More information

Conserved RNA Structures. Ivo L. Hofacker. Institut for Theoretical Chemistry, University Vienna.

Conserved RNA Structures. Ivo L. Hofacker. Institut for Theoretical Chemistry, University Vienna. onserved RN Structures Ivo L. Hofacker Institut for Theoretical hemistry, University Vienna http://www.tbi.univie.ac.at/~ivo/ Bled, January 2002 Energy Directed Folding Predict structures from sequence

More information

IMPROVEMENT OF STRUCTURE CONSERVATION INDEX WITH CENTROID ESTIMATORS

IMPROVEMENT OF STRUCTURE CONSERVATION INDEX WITH CENTROID ESTIMATORS 1 IMPROVEMENT OF STRUCTURE CONSERVATION INDEX WITH CENTROID ESTIMATORS YOHEI OKADA Department of Biosciences and Informatics, Keio University, 3 14 1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa 223 8522, Japan

More information

Genome 559 Wi RNA Function, Search, Discovery

Genome 559 Wi RNA Function, Search, Discovery Genome 559 Wi 2009 RN Function, Search, Discovery The Message Cells make lots of RN noncoding RN Functionally important, functionally diverse Structurally complex New tools required alignment, discovery,

More information

Lecture 12. DNA/RNA Structure Prediction. Epigenectics Epigenomics: Gene Expression

Lecture 12. DNA/RNA Structure Prediction. Epigenectics Epigenomics: Gene Expression C N F O N G A V B O N F O M A C S V U Master Course DNA/Protein Structurefunction Analysis and Prediction Lecture 12 DNA/NA Structure Prediction pigenectics pigenomics: Gene xpression ranscription factors

More information

Bioinformatics. Proteins II. - Pattern, Profile, & Structure Database Searching. Robert Latek, Ph.D. Bioinformatics, Biocomputing

Bioinformatics. Proteins II. - Pattern, Profile, & Structure Database Searching. Robert Latek, Ph.D. Bioinformatics, Biocomputing Bioinformatics Proteins II. - Pattern, Profile, & Structure Database Searching Robert Latek, Ph.D. Bioinformatics, Biocomputing WIBR Bioinformatics Course, Whitehead Institute, 2002 1 Proteins I.-III.

More information

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB

Homology Modeling (Comparative Structure Modeling) GBCB 5874: Problem Solving in GBCB Homology Modeling (Comparative Structure Modeling) Aims of Structural Genomics High-throughput 3D structure determination and analysis To determine or predict the 3D structures of all the proteins encoded

More information

Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots

Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots Impact Of The Energy Model On The omplexity Of RN Folding With Pseudoknots Saad Sheikh, Rolf Backofen Yann Ponty, niversity of Florida, ainesville, S lbert Ludwigs niversity, Freiburg, ermany LIX, NRS/Ecole

More information

RNA Secondary Structure Prediction: taking conservation into account

RNA Secondary Structure Prediction: taking conservation into account RNA Secondary Structure Prediction: taking conservation into account 1 Assumptions of the RNA secondary structure prediction algorithm, based on MFE: 1. The most likely structure of the RNA molecule is

More information

Computational Biology: Basics & Interesting Problems

Computational Biology: Basics & Interesting Problems Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information

More information

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE

Examples of Protein Modeling. Protein Modeling. Primary Structure. Protein Structure Description. Protein Sequence Sources. Importing Sequences to MOE Examples of Protein Modeling Protein Modeling Visualization Examination of an experimental structure to gain insight about a research question Dynamics To examine the dynamics of protein structures To

More information

Comparing whole genomes

Comparing whole genomes BioNumerics Tutorial: Comparing whole genomes 1 Aim The Chromosome Comparison window in BioNumerics has been designed for large-scale comparison of sequences of unlimited length. In this tutorial you will

More information

Molecular Modeling Lecture 7. Homology modeling insertions/deletions manual realignment

Molecular Modeling Lecture 7. Homology modeling insertions/deletions manual realignment Molecular Modeling 2018-- Lecture 7 Homology modeling insertions/deletions manual realignment Homology modeling also called comparative modeling Sequences that have similar sequence have similar structure.

More information

Introduction to Evolutionary Concepts

Introduction to Evolutionary Concepts Introduction to Evolutionary Concepts and VMD/MultiSeq - Part I Zaida (Zan) Luthey-Schulten Dept. Chemistry, Beckman Institute, Biophysics, Institute of Genomics Biology, & Physics NIH Workshop 2009 VMD/MultiSeq

More information

Markov Chains and Hidden Markov Models. = stochastic, generative models

Markov Chains and Hidden Markov Models. = stochastic, generative models Markov Chains and Hidden Markov Models = stochastic, generative models (Drawing heavily from Durbin et al., Biological Sequence Analysis) BCH339N Systems Biology / Bioinformatics Spring 2016 Edward Marcotte,

More information

Hidden Markov Models in computational biology. Ron Elber Computer Science Cornell

Hidden Markov Models in computational biology. Ron Elber Computer Science Cornell Hidden Markov Models in computational biology Ron Elber Computer Science Cornell 1 Or: how to fish homolog sequences from a database Many sequences in database RPOBESEQ Partitioned data base 2 An accessible

More information

Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction

Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction Sebastian Will 1 and Hosna Jabbari 2 1 Bioinformatics/IZBI, University Leipzig, swill@csail.mit.edu 2 Ingenuity Lab, National

More information

Visualization of Macromolecular Structures

Visualization of Macromolecular Structures Visualization of Macromolecular Structures Present by: Qihang Li orig. author: O Donoghue, et al. Structural biology is rapidly accumulating a wealth of detailed information. Over 60,000 high-resolution

More information

Chapter 1. A Method to Predict the 3D Structure of an RNA Scaffold. Xiaojun Xu and Shi-Jie Chen. Abstract. 1 Introduction

Chapter 1. A Method to Predict the 3D Structure of an RNA Scaffold. Xiaojun Xu and Shi-Jie Chen. Abstract. 1 Introduction Chapter 1 Abstract The ever increasing discoveries of noncoding RNA functions draw a strong demand for RNA structure determination from the sequence. In recently years, computational studies for RNA structures,

More information

RNA Secondary Structure Prediction: taking conservation into account

RNA Secondary Structure Prediction: taking conservation into account RNA Secondary Structure Prediction: taking conservation into account 1 13 June 2006 2 Main approaches to RNA secondary structure prediction Energy minimization (Single-strand Folding) does not require

More information

Supplementary Data for A Pipeline for Computational Design of Novel RNA-like Topologies

Supplementary Data for A Pipeline for Computational Design of Novel RNA-like Topologies Supplementary Data for A Pipeline for Computational Design of Novel RNA-like Topologies Swati Jain 1, Alain Laederach 2, Silvia B. V. Ramos 3, and Tamar Schlick 1,4,5,* 1 Department of Chemistry, New York

More information

Algorithms in Computational Biology (236522) spring 2008 Lecture #1

Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: 15:30-16:30/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office hours:??

More information

SA-REPC - Sequence Alignment with a Regular Expression Path Constraint

SA-REPC - Sequence Alignment with a Regular Expression Path Constraint SA-REPC - Sequence Alignment with a Regular Expression Path Constraint Nimrod Milo Tamar Pinhas Michal Ziv-Ukelson Ben-Gurion University of the Negev, Be er Sheva, Israel Graduate Seminar, BGU 2010 Milo,

More information

RNA evolution and Genotype to Phenotype maps

RNA evolution and Genotype to Phenotype maps RNA evolution and Genotype to Phenotype maps E.S. Colizzi November 8, 2018 Introduction Biological evolution occurs in a population because 1) different genomes can generate different reproductive success

More information

RELATIONSHIPS BETWEEN GENES/PROTEINS HOMOLOGUES

RELATIONSHIPS BETWEEN GENES/PROTEINS HOMOLOGUES Molecular Biology-2018 1 Definitions: RELATIONSHIPS BETWEEN GENES/PROTEINS HOMOLOGUES Heterologues: Genes or proteins that possess different sequences and activities. Homologues: Genes or proteins that

More information

Candidates for Novel RNA Topologies

Candidates for Novel RNA Topologies doi:10.1016/j.jmb.2004.06.054 J. Mol. Biol. (2004) 341, 1129 1144 Candidates for Novel RNA Topologies Namhee Kim 1, Nahum Shiffeldrim 1, Hin Hark Gan 1 and Tamar Schlick 1,2 * 1 Department of Chemistry

More information

Hidden Markov Models and Their Applications in Biological Sequence Analysis

Hidden Markov Models and Their Applications in Biological Sequence Analysis Hidden Markov Models and Their Applications in Biological Sequence Analysis Byung-Jun Yoon Dept. of Electrical & Computer Engineering Texas A&M University, College Station, TX 77843-3128, USA Abstract

More information

Today s Lecture: HMMs

Today s Lecture: HMMs Today s Lecture: HMMs Definitions Examples Probability calculations WDAG Dynamic programming algorithms: Forward Viterbi Parameter estimation Viterbi training 1 Hidden Markov Models Probability models

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

A Method for Aligning RNA Secondary Structures

A Method for Aligning RNA Secondary Structures Method for ligning RN Secondary Structures Jason T. L. Wang New Jersey Institute of Technology J Liu, JTL Wang, J Hu and B Tian, BM Bioinformatics, 2005 1 Outline Introduction Structural alignment of RN

More information

Using SetPSO to determine RNA secondary structure

Using SetPSO to determine RNA secondary structure Using SetPSO to determine RNA secondary structure by Charles Marais Neethling Submitted in partial fulfilment of the requirements for the degree of Master of Science (Computer Science) in the Faculty of

More information

Sparse RNA folding revisited: space efficient minimum free energy structure prediction

Sparse RNA folding revisited: space efficient minimum free energy structure prediction DOI 10.1186/s13015-016-0071-y Algorithms for Molecular Biology RESEARCH ARTICLE Sparse RNA folding revisited: space efficient minimum free energy structure prediction Sebastian Will 1* and Hosna Jabbari

More information

BIOINFORMATICS. Prediction of RNA secondary structure based on helical regions distribution

BIOINFORMATICS. Prediction of RNA secondary structure based on helical regions distribution BIOINFORMATICS Prediction of RNA secondary structure based on helical regions distribution Abstract Motivation: RNAs play an important role in many biological processes and knowing their structure is important

More information

Boltzmann probability of RNA structural neighbors and riboswitch detection

Boltzmann probability of RNA structural neighbors and riboswitch detection Boltzmann probability of RNA structural neighbors and riboswitch detection Eva Freyhult 1, Vincent Moulton 2 Peter Clote 3, 1 Linnaeus Centre for Bioinformatics, University of Uppsala, Sweden, eva.freyhult@lcb.uu.se.

More information

Junction-Explorer Help File

Junction-Explorer Help File Junction-Explorer Help File Dongrong Wen, Christian Laing, Jason T. L. Wang and Tamar Schlick Overview RNA junctions are important structural elements of three or more helices in the organization of the

More information

CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I)

CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I) CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I) Contents Alignment algorithms Needleman-Wunsch (global alignment) Smith-Waterman (local alignment) Heuristic algorithms FASTA BLAST

More information

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University General informations (1) Office hours: by appointment Office: TR3018

More information

Quantifying sequence similarity

Quantifying sequence similarity Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

More information

On the Sizes of Decision Diagrams Representing the Set of All Parse Trees of a Context-free Grammar

On the Sizes of Decision Diagrams Representing the Set of All Parse Trees of a Context-free Grammar Proceedings of Machine Learning Research vol 73:153-164, 2017 AMBN 2017 On the Sizes of Decision Diagrams Representing the Set of All Parse Trees of a Context-free Grammar Kei Amii Kyoto University Kyoto

More information

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Introduction to Bioinformatics online course : IBT Jonathan Kayondo Learning Objectives Understand

More information

Investigation of phylogenetic relationships using microrna sequences and secondary structures

Investigation of phylogenetic relationships using microrna sequences and secondary structures Investigation of phylogenetic relationships using microrna sequences and secondary structures Author: Rohit Dnyansagar a07rohdn@student.his.se Supervisor: Angelica Lindlöf angelica.lindlof@his.se School

More information

TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs

TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs 11570 11581 Nucleic Acids Research, 2017, Vol. 45, No. 20 Published online 28 September 2017 doi: 10.1093/nar/gkx815 TurboFold II: RNA structural alignment and secondary structure prediction informed by

More information

Journal of Discrete Algorithms

Journal of Discrete Algorithms Journal of Discrete Algorithms 9 (2011) 2 11 Contents lists available at ScienceDirect Journal of Discrete Algorithms www.elsevier.com/locate/da Fast RNA structure alignment for crossing input structures

More information