Boltzmann probability of RNA structural neighbors and riboswitch detection


Eva Freyhult 1, Vincent Moulton 2, Peter Clote 3

1 Linnaeus Centre for Bioinformatics, University of Uppsala, Sweden, eva.freyhult@lcb.uu.se.
2 School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK, vincent.moulton@cmp.uea.ac.uk.
3 Department of Biology, Boston College, Chestnut Hill, MA 02467, USA, clote@bc.edu. Corresponding author.

Abstract

Given an RNA nucleotide sequence $s$, let $S_0$ be any secondary structure of $s$. $S_0$ could be the minimum free energy structure of $s$, the secondary structure obtained by analysis of the X-ray structure or by comparative sequence analysis, or an arbitrary intermediate structure. Another secondary structure $S$ of $s$ is called a $\delta$-neighbor of $S_0$ if $S$ and $S_0$ differ by exactly $\delta$ base pairs. Here we describe a new software package, RNAbor, to compute the number $N_\delta$, the Boltzmann partition function $Z_\delta$ and the minimum free energy structure $\mathrm{MFE}_\delta$ over the collection of all $\delta$-neighbors of $S_0$. This computation is done simultaneously for all $\delta \le m$, in run time $O(mn^3)$ and memory $O(mn^2)$. Our novel algorithms depend on a new manner of partitioning the space of secondary structures, nested multiple recursions and dynamic programming. Computations are done with the Turner nearest neighbor energy parameters, known for their success in ab initio secondary structure prediction. We apply RNAbor to automatic detection of possible RNA conformational switches, and compare RNAbor to existing switch detection methods. Public access to our software, RNAbor (RNA neighbor), is provided by a web server available at

1 Introduction

In the last few years, there has been intense interest in RNA due to the surprising, previously unsuspected roles played by ribonucleic acid in what until now has been a predominantly protein-centric view of molecular biology.
Apart from its roles as messenger RNA and transfer RNA, ribonucleic acid molecules play a catalytic role in the peptidyltransferase reaction in peptide bond formation (??) and in intron splicing (?), both examples of enzymatic RNAs now termed ribonucleic enzymes or ribozymes (?). RNA plays a role in post-transcriptional gene regulation due to the hybridization of mRNA by small interfering RNAs

(siRNA) (??) and microRNAs (miRNA) (?). By completely different means, RNA performs transcriptional and translational gene regulation by allostery, where a portion of the 5′ untranslated region (5′ UTR) of mRNA known as a riboswitch (??) can undergo a conformational change upon binding a specific ligand such as adenine, guanine, lysine, etc. RNA is known to play critical roles in various other cellular mechanisms including dosage compensation (?), protein shuttling (?), retranslation events such as selenocysteine insertion (?) and ribosomal frameshift (??), etc. Illustrative of the growing recognition of the importance of RNA, the 2006 Nobel Prize in Physiology or Medicine was awarded to A.Z. Fire and C.C. Mello for their discovery of RNA interference and gene silencing by double-stranded RNA.

The function of noncoding RNA^1 and cis-regulatory motifs depends on the RNA tertiary structure, which is largely determined by secondary structure involving Watson-Crick and GU wobble base pair stacking, hairpin loops, bulges, interior loops and multiloops (?). For this reason, various groups have tried to develop a noncoding RNA (ncRNA) genefinder based on the fact that ncRNA has been shown to have lower folding energy than random RNA (???). One of the most successful such algorithms is the moving window ncRNA genefinder RNAz, developed by Washietl, Hofacker and Stadler (?). RNAz uses comparative genomics and a support vector machine to detect whether the current window contents structurally align with known ncRNAs and have lower folding energy than random RNA.^2

In this paper, we develop novel and efficient algorithms to compute both the number $N_\delta(s, S_0)$ and the partition function $Z_\delta(s, S_0) = \sum_S \exp(-E(S)/RT)$, where the sum ranges over all $\delta$-neighbors $S$ of $(s, S_0)$, $E(S)$ denotes the energy of $S$ with respect to the Turner nearest neighbor energy model, $R$ is the universal gas constant, and $T$ is the absolute temperature in kelvin.
Our software, called RNAbor (RNA neighbor), additionally computes graphs of the probability density function $p_\delta = Z_\delta/Z$ as a function of $\delta$. As shown in various figures, we can computationally detect low energy secondary structures which differ structurally from $S_0$. In cases where there is too little sequence identity with known examples of ncRNA for application of programs such as RNAz (?) and Dynalign (?), RNAbor shows some promise as a tool capable of ab initio detection of riboswitches and pseudoknots, a topic to be explored in future work.

RNAbor was motivated by Moulton et al. (?), who suggested that the stability of a secondary structure might depend on the number of structural neighbors at varying distances from the given structure, for instance from the minimum free energy (MFE) structure. It turns out that the number of structural neighbors at varying distances is not sufficient to distinguish between structural RNA and random RNA having the same mono- and dinucleotide frequencies; see Figure ??. However, we do see a distinction when computing a weighted count of structural neighbors, where low energy structures are more heavily weighted. Formally, this is the Boltzmann partition function with respect to all structural neighbors at a given base pair distance $\delta$. Figure ?? displays a density plot produced by RNAbor which clearly suggests alternate low energy secondary structures with radically different topologies.

^1 Noncoding RNA (ncRNA) is transcribed but does not code for a protein. Examples of ncRNA are tRNA, rRNA, riboswitches, ribozymes, miRNA, etc.
^2 Rather than randomizing the RNA sequence in the current window, a multiple sequence alignment of the moving window contents with known ncRNAs is randomized (?) by column permutation. In contrast to randomization of single RNA sequences, the Z-scores thus obtained are statistically significant.

Figure 1: (a) Number of neighbors. (b) Density plot. The number and probability of neighboring structures at varying distances from the minimum free energy (MFE) structure of the precursor microRNA dme-mir-1 (AE / ) from Drosophila melanogaster (?) (solid line) and a random RNA having the same secondary structure (dash-dotted line). The random RNA was obtained by applying RNAinverse with the MFE structure of dme-mir-1 as input structure. In data not shown, we have applied RNAbor to random RNA generated by the Altschul-Erikson algorithm (?), which preserves the mono- and dinucleotide frequencies of the given RNA. (See Workman and Krogh (?) for an argument why random RNA should be generated so as to preserve dinucleotide frequency, and see (??) for data supporting the fact that structural RNA has lower folding energy than random RNA.)

The plan of this paper is as follows. In section ??, we present graphs of the number $N_\delta$ and the Boltzmann probability density $p_\delta = Z_\delta/Z$ of structural neighbors, which differ by $\delta$ base pairs from a given secondary structure. We compare the output of our program RNAbor with the program paRNAss of Giegerich and co-workers (?) when run on a wide variety of example RNAs, some known switches and some known non-switches. In section ??, we discuss the data presented, and in section ??, we describe the algorithms to compute the number $N_\delta$, the Boltzmann partition function $Z_\delta$ and the minimum free energy structure $\mathrm{MFE}_\delta$ over the collection of all $\delta$-neighbors. Pseudocode for our implementation is presented in the appendix. Additional data obtained by running RNAbor on all SAM riboswitches from Rfam (?) is available in the web supplement bioinformatics.bc.edu/clotelab/rnabor/websupplement.
2 Results

In this section, we present probability density graphs for a variety of conformational switches and for some non-switches. Additional data is provided for all SAM riboswitches at the web supplement bioinformatics.bc.edu/clotelab/rnabor/websupplement/. We additionally compare RNAbor with the web server paRNAss (?), which uses a heuristic to determine whether there appear to be two or more clusters of distinct secondary structures for a given RNA sequence. In (???) Giegerich and co-workers compute the Boltzmann partition function for RNAshapes, which are classes of secondary structures having the same topology. For instance, [[][][]] is the shape of the usual cloverleaf secondary structure for tRNA. The data we present compares the output of RNAbor with that of paRNAss and RNAshapes.

2.1 Detecting conformational switches

In this section, we define a conformational switch to be an RNA sequence which has exactly two distinct low energy secondary structures. By multi-switch we mean an RNA sequence which can adopt two or more distinct low energy secondary structures. For a given RNA sequence $s = s_1, \ldots, s_n$ and secondary structure $S_0$ of $s$, we use RNAbor to compute $p_\delta(s, S_0) = Z_\delta(s, S_0)/Z(s, S_0)$. Taking $S_0$ to be the minimum free energy structure, or alternatively the structure determined by comparative sequence alignment (?), our intuition is that a conformational switch should display a bimodal probability density graph.

To illustrate the behavior of a typical conformational switch, we consider the known 47 nt. switch (?) with EMBL accession number AE / and sequence GUGACUGCAA UGCUAUUUGA GUAUCCUGAA AACGGGCUUU UCAGAAU. This conformational switch, which involves a pseudoknotted structure, surrounds the bacterial alpha operon ribosome binding site and can fold into two distinct structures as illustrated in Figure ??. These are at a base-pair distance of 23 from each other and their energies, as computed by RNAfold -d2, are kcal/mol and kcal/mol, respectively. Figure ?? shows the density plot, i.e.
the probability $p_\delta = Z_\delta/Z$ of finding a structure at distance $\delta$ from the input structure, which in this example is the MFE structure ...((((((...((((...))))...)))))).., the structure shown in Figure ??(A). Figure ?? depicts a similar bimodal density graph for the artificially engineered bistable switch CUUAUGAGGGUACUCAUAAGAGUAUCC of Flamm et al. (?).

In Figures ?? and ??, we analyze the 76 nt. conformational switch with PDB ID 1SJ3:R (?), which controls hepatitis delta virus ribozyme catalysis. The sequence and secondary structure of this switch are as follows.^3

GAUGGCCGGCAUGGUCCCAGCCUCCUCGCUGGCGCCGGCUGGGCAACACCAUUGCACUCCGGUGGUGAAUGGGACU
...((((((...[.[[[.(((...)))))))))...((.((...)).))...]]].].

Note that the secondary structure contains 13 base pairs, including the noncanonical base pair (A,G) located at positions (46,65), and that there are four pseudoknotted base pairs. RNAbor computes that the Boltzmann probability Pr[0-neighbors] of the MFE structure is , and that the next largest probability occurs for the collection of 18-neighbors of the MFE structure, with Pr[18-neighbors] = . In addition to computing the partition function and Boltzmann probabilities for $\delta$-neighbors, for all $\delta$, RNAbor computes the $\mathrm{MFE}_\delta$ structures; i.e. for each $\delta$, RNAbor computes the MFE structure among all $\delta$-neighbors.

^3 The secondary structure is obtained by analysis of the hydrogen bonding classification from the Nucleic Acid DataBank for NDB ID PR0122 and by application of a program we wrote to extract the maximal planar secondary structure, denoted by parentheses, and subsequently to extract the pseudoknots, denoted by square brackets. The idea of the extraction algorithm is due to Yann Ponty, and will appear elsewhere.

Figure 2: (a) The MFE structure. (b) The two alternative secondary structures with free energies and kcal/mol, respectively, of the primary structure GUGACUGCAA UGCUAUUUGA GUAUCCUGAA AACGGGCUUU UCAGAAU.

Figure 3: Boltzmann probability density plot for the 47 nt. conformational switch (?) with EMBL accession number AE / and sequence $s$ = GUGACUGCAA UGCUAUUUGA GUAUCCUGAA AACGGGCUUU UCAGAAU. The curve shows the probability, $p_\delta = Z_\delta(s, S_0)/Z(s, S_0)$, for all secondary structures of RNA sequence $s$ having base pair distance $\delta$ from the MFE structure $S_0$.

Figure 4: (Left) Boltzmann probability density plot, titled "Bistable switch designed by Flamm et al" (x-axis: base pair distance; y-axis: density of states), for the 29 nt. bistable switch artificially engineered by Flamm et al. (?) and having sequence CUUAUGAGGGUACUCAUAAGAGUAUCC. The graph shows the Boltzmann probability, $p_\delta = Z_\delta(s, S_0)/Z(s, S_0)$, of all $\delta$-neighbors, for all values of $\delta$ bounded by sequence length. (Right) Dot plot produced by RNAfold -d2 -p. The upper triangular region represents base pair probabilities, as computed by McCaskill's algorithm (?), while the lower triangular region represents base pairs from the MFE structure.

Figure 5: (a) MFE structure. (b) 18-neighbor. Two alternative secondary structures for the 76 nt. conformational switch which controls hepatitis delta virus ribozyme catalysis. This switch has sequence GAUGGCCGGC AUGGUCCCAG CCUCCUCGCU GGCGCCGGCU GGGCAACACC AUUGCACUCC GGUGGUGAAU GGGACU. The 3-dimensional structure of this switch, as determined by X-ray crystallography (?), is available in the Protein Data Bank with PDB code 1SJ3:R. (Left) Minimum free energy structure with free energy kcal/mol. (Right) Alternative low energy structure with free energy kcal/mol. This $\mathrm{MFE}_{18}$ structure has the lowest free energy among all many 18-neighbors of the MFE structure.

paRNAss

In paRNAss (?) a structural RNA switch is predicted by studying properties of the energy landscape of the RNA. Secondary structures are sampled from the structure space using RNAsubopt (?) or mfold (?). Pairwise distances are calculated between the sampled structures using two different distance measures (e.g. energy barrier, morphological, tree alignment or string edit distance). Using a standard clustering method, the structures are grouped into two clusters based on the distance measures. If the RNA is a conformational switch, it has two stable structures and hence two clusters are expected (in a multi-switch, more than two stable structures are expected).
As an additional test, the consensus structures of the clusters are computed and, for each sample structure, the distances to the two consensus structures are plotted against each other. If the RNA is really a conformational switch, then the paRNAss output should display two clouds of points, one near the x-axis and one near the y-axis.

Figure 6: (a) Density plot (x-axis: base pair distance; y-axis: density of states). Density of $\delta$-neighbors and output of RNAshapes for the 76 nt. conformational switch which controls hepatitis delta virus ribozyme catalysis with PDB code 1SJ3:R (?). (Sequence is given in Figure ??.) (Left) RNAbor density plot which graphs the Boltzmann probability Pr[$\delta$-neighbors] of $\delta$-neighbors as a function of $\delta$. In this case, Pr[0-neighbors] = , and Pr[18-neighbors] = . When comparing the number of structures, there is one 0-neighbor, the MFE structure itself, many 18-neighbors, and many 38-neighbors (the largest class of $\delta$-neighbors for all $\delta$). By way of comparison, the output of RNAshapes lists the shapes [][], [[][]], [][[][]], [], [][][], [[][]][] and [[][][]]. Note that in this example, the two alternative low-energy structures displayed in this figure both have the RNA shape [][], and that RNAshapes does not predict a switch.

Comparison with paRNAss

We now compare the ability of RNAbor and paRNAss to predict RNA conformational switches. We have chosen to display the paRNAss distance plot of energy barrier versus morphological distance (?), but in all the examples below the distance plots using tree alignment or string edit distance showed similar results.

The E. coli hok (host killing) mRNA folds into two different conformations (?). The full length mRNA folds into a stable structure involving a long-range interaction between the 5′ and 3′ ends. Degradation of the 3′ end leads to a conformational change as the stabilizing long-range interaction is broken. Here we have investigated the part of the mRNA that undergoes a conformational change (as provided on the paRNAss webserver uni-bielefeld.de/parnass). For this RNA, both RNAbor and paRNAss detect the conformational switch: the RNAbor probability plot shows two distinct peaks, suggesting two alternative stable structures, and the paRNAss plot shows two clearly separated clusters. Both suggest that all the reasonably stable structures fall into one of two conformations; see Figure ??. Although both RNAbor and paRNAss suggest that the hok gene has two alternative structures, there are some uncertainties in the result. In the RNAbor density plot there are actually three peaks (even though the third peak is significantly smaller than the other two), indicating that there might be more than two alternative structures.

The 5′ untranslated region (UTR) of E. coli thiM mRNA undergoes a change in structure that is important for regulation (?). Both RNAbor and paRNAss indicate more than one stable structure for the thiM leader. As can be seen from Figure ??, there actually seem to be more than two alternative structures.
However, the third structure seems to be less important (lower probability), and hence this RNA is predicted to be a conformational switch by RNAbor. A comparison of how RNAbor and paRNAss perform on the example conformational switches available on the paRNAss website is summarized in Table ??.

We can of course also investigate a non-switch with RNAbor and paRNAss. Figure ?? shows an example, Hammerhead I. The Hammerhead I structure is an example of a not very well-defined structure (?). In the RNAbor plot there are many peaks, rather than a single peak as is the case for well-defined structures such as miRNA (see Figure ??) or two distinct peaks as for a conformational switch. paRNAss also shows one large cloud in the distance plot, indicating that Hammerhead I does not have two separated alternative structures.

To investigate how many false predictions we get with our switch prediction method based on the RNAbor computation, we have investigated a set of RNAs assumed not to have alternative structures. We have included RNAs with both well-defined and not so well-defined structures. See Table ?? for the results.
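The peak-based reading of a density plot described above can be sketched in a few lines. This is only an illustration, not the decision rule used by RNAbor: the function names, the `min_height` threshold of 0.05 and the example density values are all hypothetical, and the source does not state how small a peak must be to be ignored.

```python
def find_peaks(density, min_height=0.05):
    """Return the distances d at which density[d] is a local maximum.

    density[d] plays the role of p_delta for delta = d.
    min_height is an assumed cutoff below which peaks are ignored.
    """
    peaks = []
    n = len(density)
    for d, p in enumerate(density):
        left = density[d - 1] if d > 0 else 0.0
        right = density[d + 1] if d < n - 1 else 0.0
        # strict on the left, non-strict on the right: a plateau counts once
        if p >= min_height and p > left and p >= right:
            peaks.append(d)
    return peaks


def classify(density, min_height=0.05):
    """Label a density by its number of pronounced peaks (hypothetical labels)."""
    count = len(find_peaks(density, min_height))
    if count <= 1:
        return "single well"
    if count == 2:
        return "candidate switch"
    return "candidate multi-switch"


# Made-up bimodal density: one peak at distance 0, a second at distance 4.
example = [0.40, 0.10, 0.00, 0.05, 0.30, 0.15]
print(find_peaks(example), classify(example))
```

Raising `min_height` mimics the judgment made for the thiM leader, where a third, low-probability peak is disregarded.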

Figure 7: E. coli hok. (a) RNAbor density plot. (b) paRNAss distance plot. (c) paRNAss validation plot.

3 Discussion

In this paper, we present probability density graphs for a variety of conformational switches and for some non-switches. We additionally compare RNAbor with the web server paRNAss (?), which uses a heuristic to determine whether there appear to be two or more clusters of distinct secondary structures for a given RNA sequence. In (?) Ding and Lawrence describe how to sample RNA secondary structures after first computing the Boltzmann partition function using McCaskill's algorithm (?). By computing the base pair distance between each pair of sampled structures, Ding et al. (?) subsequently compute centroids of clusters produced by hierarchical clustering. This Boltzmann centroid method of Ding et al. (?) provides a computational means of probing the landscape of the low energy ensemble at thermodynamic equilibrium. Since paRNAss calls the Vienna RNA Package program RNAsubopt, it requires a user-defined bound $E$ in order to generate all secondary structures within $E$ kcal/mol of the minimum free energy. In contrast to paRNAss and the Boltzmann centroid method, our algorithm RNAbor directly computes the Boltzmann partition function for all secondary structures of a given RNA sequence which differ by exactly $\delta$ base pairs, for all $\delta$ less than a user-defined bound.

Potential applications of RNAbor which will be pursued in future work include the following.

Figure 8: E. coli thiM leader. (a) RNAbor density plot. (b) The paRNAss distance plot.

Figure 9: Hammerhead I. (a) RNAbor density plot. (b) The paRNAss distance plot.

Since RNAbor allows one to distinguish whether the given RNA nucleotide sequence $s$ has a single pronounced well of attraction around a given secondary structure $S_0$ of $s$, it may be possible to use RNAbor to detect situations where the native secondary structure, as determined by X-ray crystallography, is different from that proposed by mfold and RNAfold. The idea would be to determine whether there is no peak around $\delta = 0$ when $S_0$ is taken to be the MFE structure.

Figure ??(a) shows an interesting example where the RNA seems to have more than one alternative structure. Does this RNA have more than two alternative structures? Is it the case that the MFE structure is not biologically functional? (In this example, the other two alternative structures seem to be probable.) Using RNAbor, we can determine the minimum free energy structures over all $\delta$-neighbors for those $\delta$ where the Boltzmann probability $p_\delta$ is high. Ultimately, chemical probing experiments might determine whether these $\mathrm{MFE}_\delta$ structures are the preferred biologically active structures.

RNAbor is a useful complement to already existing tools for detecting putative conformational switches. Unlike paRNAss, the number of structures

to be analyzed and the maximum allowable free energy difference from the MFE structure need not be decided in advance. (These can change the paRNAss result quite dramatically.) Depending on the number of structures to be analyzed and the energy bound, paRNAss can take an exponential amount of time, in contrast to the $O(mn^3)$ time for RNAbor to compute $N_\delta$, $Z_\delta$ and $\mathrm{MFE}_\delta$.

As for any bioinformatics software, it will be necessary to perform experimental validation of predictions made by RNAbor. In future work, we intend to include user-defined constraints, which allow the user to require all investigated structures to contain certain specified base pairs and certain specified nucleotides to remain unpaired. This will allow RNAbor to be used together with chemical probing experiments to determine biologically active conformers.

4 Materials and methods

Given an RNA nucleotide sequence $s$, consider a fixed secondary structure $S_0$ of $s$. Here $S_0$ could be the MFE structure of $s$, the secondary structure obtained from the 3-dimensional X-ray conformation or by comparative sequence analysis, or an arbitrary intermediate structure (such intermediate forms may play a biologically important role, as in viroids). Recall that a secondary structure $S$ of $s$ is a $\delta$-neighbor of $(s, S_0)$ if $S$ and $S_0$ differ by exactly $\delta$ base pairs. In this section, we describe how to efficiently compute the number $N_\delta$ of $\delta$-neighbors, the partition function $Z_\delta$ for $\delta$-neighbors, and the minimum free energy structure $\mathrm{MFE}_\delta$ over all $\delta$-neighbors.

4.1 The number of δ-neighbors of a fixed secondary structure

Let $s = a_1, \ldots, a_n$ denote an RNA sequence, i.e. a sequence of letters in the alphabet of nucleotides {A, C, G, U}. A secondary structure $S$ on $s$ is a set of base pairs $(i, j)$, where $\theta \ge 0$ is an integer (corresponding to the minimum hairpin loop size, which we usually set to 3) and $1 \le i \le i + \theta < j \le n$, such that if $(k, l)$ is a base pair, then $k = i \Leftrightarrow l = j$ and $i < k < j \Rightarrow i < l < j$.
We say that $S$ is compatible with $s$ if for every base pair $(i, j)$ in $S$ the pair $a_i a_j$ is contained in the set $B$ = {AU, UA, GC, CG, GU, UG} (i.e. the set of Watson-Crick base pairings together with wobbles). Given two secondary structures $S, T$ on $s$, we define the base-pair distance $d_{BP}$ between $S$ and $T$ to be the number of base pairs that they do not have in common, i.e.

$$d_{BP}(S, T) = |S \,\triangle\, T| = |S \cup T| - |S \cap T|.$$

For the rest of this section, we consider both $s$ and the secondary structure $S$ on $s$ to be fixed. We now provide recursions for determining the number of secondary structures $T$ compatible with $s$ that are at base-pair distance precisely $\delta$ from $S$. Let $S_{[i,j]}$ denote the restriction of $S$ to the interval $[i, j]$ of $s$, that is, the set of base pairs

$$S_{[i,j]} = \{(k, l) : i \le k < l \le j,\ (k, l) \in S\}.$$
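The two definitions above translate directly into set operations. The following is a minimal sketch, assuming structures are given in dot-bracket notation without pseudoknots; the helper names `parse_dot_bracket`, `restrict` and `bp_distance` are illustrative and not taken from RNAbor.

```python
def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), 1-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def restrict(pairs, i, j):
    """Restriction S_[i,j]: the base pairs (k, l) of S with i <= k < l <= j."""
    return {(k, l) for (k, l) in pairs if i <= k and l <= j}


def bp_distance(S, T):
    """Base-pair distance: number of pairs lying in exactly one of S and T."""
    return len(S ^ T)


S = parse_dot_bracket("((...)).....")   # {(1, 7), (2, 6)}
T = parse_dot_bracket("(.....)(...)")   # {(1, 7), (8, 12)}
print(bp_distance(S, T))                # 2: pairs (2, 6) and (8, 12) differ
print(restrict(S, 2, 6))                # {(2, 6)}
```

Representing a structure as a set of index pairs makes the symmetric difference `S ^ T` exactly the set counted by $d_{BP}$.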

A secondary structure $T_{[i,j]}$ on $s$ is a $\delta$-neighbor of $S_{[i,j]}$ if $d_{BP}(S_{[i,j]}, T_{[i,j]}) = \delta$. For all $0 \le \delta \le m$ and all $1 \le i \le j \le n$, let $N^\delta_{i,j}(s, S)$ denote the number of secondary structures $T_{[i,j]}$ compatible with $s$ such that $d_{BP}(S_{[i,j]}, T_{[i,j]}) = \delta$. In the following we may omit the sequence $s$ and secondary structure $S$ in our notation since these are fixed. In particular, we put $N^\delta_{i,j} = N^\delta_{i,j}(s, S)$.

$N^\delta_{i,j}$ is computed recursively. The initial conditions are

$$N^0_{i,j} = 1, \quad \text{for } i < j, \qquad (1)$$

since the only 0-neighbor of a structure is the structure itself, and

$$N^\delta_{i,j} = 0, \quad \text{for } \delta > 0,\ j \le i + \theta, \qquad (2)$$

since the empty structure is the only possible structure for a sequence shorter than $\theta + 2$ nucleotides, so that there are no $\delta$-neighbors for $\delta > 0$. The recursion used to compute $N^\delta_{i,j}$ for $\delta > 0$ and $j > i + \theta$ is

$$N^\delta_{i,j} = N^{\delta - b_0}_{i,j-1} + \sum_{a_k a_j \in B,\ i \le k < j}\ \sum_{w + w' = \delta - b} N^{w}_{i,k-1}\, N^{w'}_{k+1,j-1}, \qquad (3)$$

where $b_0 = 1$ if $j$ is base-paired in $S_{[i,j]}$ and 0 otherwise, and $b = d_{BP}(S_{[i,j]}, S_{[i,k-1]} \cup S_{[k+1,j-1]} \cup \{(k, j)\})$. This holds since, in a secondary structure $T_{[i,j]}$ on $[i, j]$ that is a $\delta$-neighbor of $S_{[i,j]}$, either nucleotide $j$ is unpaired in $[i, j]$ or it is paired to a nucleotide $k$ such that $i \le k < j$. In the latter case it is enough to study the smaller sequence segments $[i, k-1]$ and $[k+1, j-1]$, noting that, except for $(k, j)$, base pairs outside of these regions are not allowed. In addition, for $d_{BP}(S_{[i,j]}, T_{[i,j]}) = \delta$ to be fulfilled it is necessary that $w + w' = \delta - b$, where $w = d_{BP}(S_{[i,k-1]}, T_{[i,k-1]})$ and $w' = d_{BP}(S_{[k+1,j-1]}, T_{[k+1,j-1]})$, since $b$ is the number of base pairs that differ between $S_{[i,j]}$ and a structure $T_{[i,j]}$ due to the introduction of the base pair $(k, j)$. Pseudocode for computing $N^\delta_{i,j}$ for values of $\delta$ between 0 and $m$ is given in Appendix ??. The algorithm runs in time $O(mn^3)$ and space $O(mn^2)$ where, as defined above, $n$ is the length of $s$ and $m$ is the maximum value of $\delta$.
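Recursion (3) can be sketched as a memoized function. This is a hedged illustration rather than the RNAbor implementation (whose pseudocode is in the appendix): the name `count_neighbors` is hypothetical, empty intervals are assumed to contribute one structure at distance 0, and the recursion is applied for $\delta = 0$ as well, which reproduces $N^0_{i,j} = 1$ whenever $S$ is compatible with $s$. For clarity the sketch recomputes the restrictions of $S$ on every call, so it does not attain the $O(mn^3)$ bound.

```python
from functools import lru_cache

PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}  # the set B of allowed base pairs


def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), 1-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def count_neighbors(seq, structure, m, theta=3):
    """Return [N_0, ..., N_m]: number of structures at each distance from S."""
    n = len(seq)
    S = parse_dot_bracket(structure)

    def restrict(i, j):
        return {(k, l) for (k, l) in S if i <= k and l <= j}

    @lru_cache(maxsize=None)
    def N(d, i, j):
        if d < 0:
            return 0
        if j - i <= theta:                 # too short for a base pair (or empty)
            return 1 if d == 0 else 0
        Sij = restrict(i, j)
        b0 = 1 if any(l == j for (_, l) in Sij) else 0
        total = N(d - b0, i, j - 1)        # case: j unpaired in T
        for k in range(i, j - theta):      # case: j paired with k, j - k > theta
            if seq[k - 1] + seq[j - 1] not in PAIRS:
                continue
            forced = restrict(i, k - 1) | restrict(k + 1, j - 1) | {(k, j)}
            b = len(Sij ^ forced)
            for w in range(d - b + 1):     # split the remaining distance
                total += N(w, i, k - 1) * N(d - b - w, k + 1, j - 1)
        return total

    return [N(d, 1, n) for d in range(m + 1)]


print(count_neighbors("GGGAAACCC", "(((...)))", m=6))
```

In this toy example the counts sum to the total number of structures compatible with the sequence, since no structure lies further than distance 6 from the given one.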
4.2 analogue

In this section, we explain how to extend our approach for computing $N^\delta_{i,j}$ to one for computing the partition function contribution of the set of structures compatible with a given RNA sequence $s$ at a fixed base-pair distance $\delta$ from an RNA structure $S$ compatible with $s$. This allows us to compute the probability of finding a structure compatible with $s$ at distance $\delta$ from $S$.

It is straightforward to extend the previous approach to compute partition functions for the Nussinov-Jacobson energy model. In particular, by simply replacing recursion (??) with

$$N^\delta_{i,j} = N^{\delta - b_0}_{i,j-1} + \sum_{a_k a_j \in B,\ i \le k < j}\ \sum_{w + w' = \delta - b} N^{w}_{i,k-1}\, N^{w'}_{k+1,j-1}\, e^{-E_{bp}(k,j)/RT}, \qquad (4)$$

where $E_{bp}(k, j)$ is the energy of the base pair $(k, j)$, $R$ is the gas constant, and $T$ is the temperature, we can compute the partition function contribution of

structures at a given base-pair distance $\delta$. The base-pair energy $E_{bp}(k, l)$ takes the value $-1$ if $a_k a_l \in B$ and 0 otherwise. Note that the energy contribution can be altered for different base pairs (e.g. $-3$ for GC, $-2$ for AU and $-1$ for GU are weights used in (?)).

Employing a substantially more complicated algorithm, similar to the dynamic programming calculation of the partition function described in (?), the partition function contributions can also be computed according to the Turner energy model. In the Turner energy model a secondary structure is decomposed into loops, as described in (?), and the energy is computed as a sum of the energy contributions of the loops. A $k$-loop consists of $k - 1$ base pairs (excluding the closing base pair) and $u$ unpaired bases. The energies of 1-loops (hairpins) and 2-loops (stacks if $u = 0$, bulges or interior loops if $u > 0$) are experimentally determined (??) and depend on $k$ and $u$ as well as the RNA sequence. In the Turner model the energies for multi-loops ($k > 2$) are generally determined by the approximate linear model $E_M = a + b(k - 1) + cu$, where $a$, $b$ and $c$ are constants.

As before, from now on we regard $s$ as a fixed RNA sequence and $S$ as a fixed secondary structure compatible with $s$. The partition function for $s$ is then defined as $Z = \sum_T e^{-E_T/RT}$, where the sum is taken over all structures $T$ compatible with $s$, and $E_T$ is the energy of the structure $T$. We aim to compute the restriction $Z^\delta = Z^\delta_{1,n} = Z^\delta_{1,n}(s, S)$, that is, the sum of $e^{-E_T/RT}$ taken over all structures $T$ that are compatible with $s$ and at base-pair distance $\delta$ from $S$. The probability of finding a structure at distance $\delta$ from $S$ is then given by $p_\delta = Z^\delta/Z$. As with the usual McCaskill partition function calculations (?), in the dynamic programming we use three matrices $Z$, $ZB$ and $ZM$ for recursively computing $Z^\delta$ instead of the single matrix $N$ used for computing $N^\delta$ in the previous section.
In particular, for the sequence segment $[i, j]$ of $s$, define

$$Z^\delta_{i,j} = \sum_{T_{[i,j]}} e^{-E_{T_{[i,j]}}/RT},$$

where the sum is over all structures $T_{[i,j]}$ compatible with $s$ and such that $d_{BP}(S_{[i,j]}, T_{[i,j]}) = \delta$. Also, define the restricted partition function $ZB^\delta_{i,j}$ as the sum of $e^{-E_{T_{[i,j]}}/RT}$ taken over all such structures $T_{[i,j]}$ with $(i, j) \in T_{[i,j]}$, and $ZM^\delta_{i,j}$ as the partition function contribution if the sequence segment $[i, j]$ is part of a multi-loop. The matrices $Z$, $ZB$ and $ZM$ are filled using the following three recursions. To compute $Z$ we use

$$Z^\delta_{i,j} = Z^{\delta - b_0}_{i,j-1} + \sum_{a_k a_j \in B,\ i \le k < j}\ \sum_{w + w' = \delta - d_1} Z^{w}_{i,k-1}\, ZB^{w'}_{k,j}\, e^{-E_d/RT},$$

where $E_d$ is the energy contribution due to dangling ends (energy contributions from single bases stacking on adjacent base pairs) and closing AU base pairs (since a non-GC base pair closing a stem has a destabilizing effect), and $d_1 = d_{BP}(S_{[i,j]}, S_{[i,k-1]} \cup S_{[k,j]})$. Note that the first term of this recursion corresponds to the case where $j$ is unpaired (and hence has no energy contribution) in $[i, j]$. The second term includes all other structures on $[i, j]$. The sum is taken over all possible base pairs $(k, j)$ with $i \le k < j$. If $(k, j)$ is a base pair, the partition function for $[k, j]$ is given by $ZB^{w'}_{k,j}$ and the partition function for $[i, k-1]$ is given by $Z^{w}_{i,k-1}$.
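The Turner-model recursions are lengthy, but the simpler Nussinov-Jacobson variant, recursion (4), is short enough to sketch, together with the density $p_\delta = Z^\delta/Z$. This is an illustration under stated assumptions, not RNAbor itself: every base pair is given the same energy `e_bp = -1.0`, the value `rt = 0.6163` (roughly $RT$ in kcal/mol at 37 °C) is assumed, and the function name `neighbor_partition` is hypothetical.

```python
import math
from functools import lru_cache

PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}  # the set B of allowed base pairs


def parse_dot_bracket(structure):
    """Return the set of base pairs (i, j), 1-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def neighbor_partition(seq, structure, m, theta=3, rt=0.6163, e_bp=-1.0):
    """Return [Z_0, ..., Z_m] under a uniform per-base-pair energy e_bp."""
    n = len(seq)
    S = parse_dot_bracket(structure)
    boltz = math.exp(-e_bp / rt)           # Boltzmann factor of one base pair

    def restrict(i, j):
        return {(k, l) for (k, l) in S if i <= k and l <= j}

    @lru_cache(maxsize=None)
    def Z(d, i, j):
        if d < 0:
            return 0.0
        if j - i <= theta:                 # only the empty structure, energy 0
            return 1.0 if d == 0 else 0.0
        Sij = restrict(i, j)
        b0 = 1 if any(l == j for (_, l) in Sij) else 0
        total = Z(d - b0, i, j - 1)        # case: j unpaired
        for k in range(i, j - theta):      # case: (k, j) is a base pair
            if seq[k - 1] + seq[j - 1] not in PAIRS:
                continue
            forced = restrict(i, k - 1) | restrict(k + 1, j - 1) | {(k, j)}
            b = len(Sij ^ forced)
            for w in range(d - b + 1):
                total += Z(w, i, k - 1) * Z(d - b - w, k + 1, j - 1) * boltz
        return total

    return [Z(d, 1, n) for d in range(m + 1)]


seq, s0 = "GGGAAACCC", "(((...)))"
z = neighbor_partition(seq, s0, m=len(seq))   # distance never exceeds len(seq)
total = sum(z)                                 # full partition function Z
for d, zd in enumerate(z):
    print(d, round(zd / total, 4))             # density p_delta
```

Setting `e_bp=0.0` turns every Boltzmann factor into 1, so the same function then returns the plain counts $N_\delta$.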

We compute $ZB$ using the recursion

$$ZB^\delta_{i,j} = \Delta\big(d_{BP}(S_{[i,j]}, \{(i, j)\}),\, \delta\big)\, e^{-E(i,j)/RT} + \sum_{a_k a_l \in B,\ i < k < l < j} ZB^{\delta - d_2}_{k,l}\, e^{-E(i,j,k,l)/RT} + \sum_{a_k a_l \in B,\ i < k < l < j}\ \sum_{w + w' = \delta - d_3} ZM^{w}_{i+1,k-1}\, ZB^{w'}_{k,l}\, e^{-(a + b + c(j - l - 1))/RT}, \qquad (5)$$

where $E(i, j)$ is the energy of the hairpin loop with closing base pair $(i, j)$, $E(i, j, k, l)$ is the energy of the stack, bulge or interior loop with the closing base pair $(i, j)$ and the interior base pair $(k, l)$, $d_2 = d_{BP}(S_{[i,j]}, S_{[k,l]} \cup \{(i, j)\})$, and $d_3 = d_{BP}(S_{[i,j]}, S_{[i+1,k-1]} \cup S_{[k,l]} \cup \{(i, j)\})$. Here, $\Delta(x, y)$ is the Kronecker function, which equals 1 if $x = y$, and is otherwise 0. Note that since the above equation computes $ZB^\delta_{i,j}$, it follows that $(i, j)$ forms a base pair in the neighboring structures $T_{[i,j]}$ (if this is not possible then $ZB^\delta_{i,j} = 0$). The first term in the recursion takes care of the case where $(i, j)$ is the only base pair on $[i, j]$, i.e. $(i, j)$ closes a hairpin loop. The second term handles the case where there is an interior loop (or a bulge or a stack) closed by $(i, j)$ and $(k, l)$. The third term takes care of all the structures where $(i, j)$ closes a multi-loop. To reduce the complexity of the algorithm, the interior and bulge loop size can be limited to a maximum size of $L$ by requiring that $l > j - L$ in the above recursion.

The final recursion, for computing $ZM$, is

$$ZM^\delta_{i,j} = ZM^{\delta - b_0}_{i,j-1}\, e^{-c/RT} + \sum_{a_k a_j \in B,\ i \le k < j} \Big( ZB^{\delta - d_4}_{k,j}\, e^{-(b + c(k - i))/RT} + \sum_{w + w' = \delta - d_5} ZM^{w}_{i,k-1}\, ZB^{w'}_{k,j}\, e^{-b/RT} \Big), \qquad (6)$$

where $d_4 = d_{BP}(S_{[i,j]}, S_{[k,j]})$ and $d_5 = d_{BP}(S_{[i,j]}, S_{[i,k-1]} \cup S_{[k,j]})$. Note that since $ZM^\delta_{i,j}$ computes the partition function contribution under the assumption that $[i, j]$ is part of a multi-loop, there will be exactly one stem-loop structure in this region (the $ZB$ term) or more than one (the $ZB$-$ZM$ term). Pseudocode for computing $Z^\delta$ is given in Appendix ??.
The complexity is the same as for computing the number of δ-neighbors, $O(mn^2)$ in space and $O(mn^3)$ in time, if the size of internal loops and bulges is limited to a fixed length such as 30, following the convention of the Vienna RNA Package.

4.3 Minimum free energy δ-neighbors

Given an RNA nucleotide sequence $s$ and secondary structure $S_0$, the minimum free energy δ-neighbor, denoted $MFE^{\delta}$, is the secondary structure $S$ of $s$ that has base pair distance δ from $S_0$ and has the least free energy $E^{\delta}$ among all structures at base pair distance δ from $S_0$. Free energy is measured according to the Turner energy model (??), where our treatment of dangles follows that of the Vienna RNA Package with the -d2 option.

In this section, we describe a novel algorithm capable of computing the $MFE^{\delta}$ structures, for all δ. As in our partition function computation, the run time [resp. space requirement] to compute all $MFE^{\delta}$ structures for $\delta \le m$ is $O(mn^3)$ [resp. $O(mn^2)$]. This algorithm is obtained from the algorithm in section ?? essentially by replacing the Boltzmann factor $e^{-E(S)/RT}$ by the free energy $E(S)$ and by replacing the operations of addition [resp. multiplication] by minimization [resp. addition]. In future work, we plan to analyze the morphological changes in structure in proceeding from $S_0$ to $MFE^{0}$, $MFE^{1}$, $MFE^{2}$, etc. Such analysis could prove useful in conformational switch detection and other applications.

Fix an RNA nucleotide sequence $s = a_1, \ldots, a_n$ and a secondary structure $S_0$ of $s$. To compute $E^{\delta}$ we use
\[
E^{\delta}_{i,j} = \min\Big\{ E^{\delta - b_0}_{i,j-1},\;
\min_{\substack{a_k a_j \in \mathbb{B} \\ i \le k < j}} \; \min_{w+w' = \delta - d_1} \big( E^{w}_{i,k-1} + EB^{w'}_{k,j} + E_d \big) \Big\},
\]
where $E_d$ is the energy contribution due to dangling ends (energy contributions from single bases stacking on adjacent base pairs) and closing AU base pairs (since a non-GC base pair closing a stem has a destabilizing effect), and $d_1 = d_{BP}(S_{[i,j]}, S_{[i,k-1]} \cup S_{[k,j]})$. Note that the first term of this recursion corresponds to the case where $j$ is unpaired (and hence has no energy contribution) in $[i, j]$. The second term includes all other structures on $[i, j]$. The minimization is taken over all possible base pairs $(k, j)$ with $i \le k < j$. If $(k, j)$ is a base pair, then the minimum free energy of all $w'$-neighbors of $S_0$ restricted to $[k, j]$ is given by $EB^{w'}_{k,j}$, while the minimum free energy of all $w$-neighbors of $S_0$ restricted to $[i, k-1]$ is given by $E^{w}_{i,k-1}$.
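The passage from the partition function to the minimum free energy amounts to running the same combination step over a different pair of operations. The sketch below illustrates this for the inner sum over $w + w' = \delta - d$ only; it is not the authors' code. The function `combine`, the toy energy vectors and the value of `RT` are assumptions made for the example, and infeasible entries are represented by $+\infty$ in the energy version so that they never win a minimization (this convention is also an assumption of the sketch).

```python
import math

def combine(left, right, d, m, add, mul, zero):
    """out[delta] = 'add' over all w + w' = delta - d of mul(left[w], right[w']).
    left and right are indexed by distance; 'zero' is the neutral element of add."""
    out = [zero] * (m + 1)
    for w, a in enumerate(left):
        for w2, b in enumerate(right):
            delta = w + w2 + d
            if delta <= m:
                out[delta] = add(out[delta], mul(a, b))
    return out

RT = 0.6163  # kcal/mol near 37 C (assumed value for the example)

# Toy minimum energies (kcal/mol) of two subintervals, indexed by distance.
e_left = [-1.0, math.inf, -2.5]
e_right = [-3.0, -0.5]

# Partition-function version: (addition, multiplication) on Boltzmann factors.
z_left = [math.exp(-e / RT) for e in e_left]    # exp(-inf) evaluates to 0.0
z_right = [math.exp(-e / RT) for e in e_right]
z = combine(z_left, z_right, d=1, m=4,
            add=lambda a, b: a + b, mul=lambda a, b: a * b, zero=0.0)

# Minimum-free-energy version: (minimization, addition) on energies.
e_min = combine(e_left, e_right, d=1, m=4,
                add=min, mul=lambda a, b: a + b, zero=math.inf)
print(e_min)
```

Because every Boltzmann sum contains the factor of its lowest-energy term, $-RT \ln$ of each entry of `z` is never larger than the matching entry of `e_min`, which gives a quick consistency check between the two computations.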
We compute $EB$ using the recursion
\[
EB^{\delta}_{i,j} = \min\Big\{ \delta\big(d_{BP}(S_{[i,j]}, \{(i,j)\}), \delta\big)\, E(i,j),\;
\min_{\substack{a_k a_l \in \mathbb{B} \\ i<k<l<j}} \big( EB^{\delta - d_2}_{k,l} + E(i,j,k,l) \big),\;
\min_{\substack{a_k a_l \in \mathbb{B} \\ i<k<l<j}} \; \min_{w+w' = \delta - d_3} \big( EM^{w}_{i+1,k-1} + EB^{w'}_{k,l} + a + b + c(j-l-1) \big) \Big\}, \tag{7}
\]
where $E(i, j)$ is the energy of the hairpin loop with closing base pair $(i, j)$, $E(i, j, k, l)$ is the energy of the stack, bulge or interior loop with the closing base pair $(i, j)$ and the interior base pair $(k, l)$, $d_2 = d_{BP}(S_{[i,j]}, S_{[k,l]} \cup \{(i, j)\})$, and $d_3 = d_{BP}(S_{[i,j]}, S_{[i+1,k-1]} \cup S_{[k,l]} \cup \{(i, j)\})$. Here, the two-argument $\delta(x, y)$ is the Kronecker function, which equals 1 if $x = y$, and is otherwise 0. Note that since the above equation computes $EB^{\delta}_{i,j}$, it follows that $(i, j)$ forms a base pair in the neighboring structures $T_{[i,j]}$ (if this is not possible then $EB^{\delta}_{i,j} = 0$). The first term in the recursion takes care of the case where $(i, j)$ is the only base pair on $[i, j]$, i.e. $(i, j)$ closes a hairpin loop. The second term handles the case where there is an interior loop (or a bulge or a stack) closed by $(i, j)$ and $(k, l)$. The third term takes care of all the structures where $(i, j)$ closes a multi-loop. To reduce the complexity of the algorithm, the interior and bulge loop size can be limited to a maximum size of $L$, by requiring that $l > j - L$ in the above recursion.

The final recursion, for computing $EM$, is
\[
EM^{\delta}_{i,j} = \min\Big\{ EM^{\delta - b_0}_{i,j-1} + c,\;
\min_{\substack{a_k a_j \in \mathbb{B} \\ i \le k < j}} \min\Big( EB^{\delta - d_4}_{k,j} + b + c(k-i),\;
\min_{w+w' = \delta - d_5} \big( EM^{w}_{i,k-1} + EB^{w'}_{k,j} + b \big) \Big) \Big\}, \tag{8}
\]
where $d_4 = d_{BP}(S_{[i,j]}, S_{[k,j]})$ and $d_5 = d_{BP}(S_{[i,j]}, S_{[i,k-1]} \cup S_{[k,j]})$. Note that since $EM^{\delta}_{i,j}$ computes the minimum free energy of δ-neighbors of $S_0$ restricted to $[i, j]$, under the assumption that $[i, j]$ is part of a multi-loop, this minimization is made over structures with one stem-loop in this region (the $EB$ term) and structures having more than one (the $EM$+$EB$ term). For reasons of space, the pseudocode for computing $E^{\delta}$ is not presented; given our previous description of $E^{\delta}$ and the pseudocode for computing the partition function $Z^{\delta}$, appearing in the appendix, the reader should have no difficulty reconstructing the pseudocode for $E^{\delta}$.

5 Acknowledgements

Research of P.C. was partially supported by National Science Foundation DBI, which additionally supported some travel of E.F. All three authors would like to thank Elena Rivas, Eric Westhof and funding agencies for organizing the meeting RNA-2006 in Benasque, Spain, in July 2006, where some of this work was carried out. Thanks as well to Yann Ponty, for reading the manuscript and telling us of his algorithm to extract maximal planar secondary structures, details of which are forthcoming.

Table 1: RNAbor and paRNAss predictions of positive examples provided on the paRNAss website and a set of negative examples. The paRNAss predictions for the positive examples are presented as they were in (?) and as we have manually interpreted the paRNAss plots (using default settings) ourselves. For the negative examples we present our own interpretations. The RNAbor results are automatic/manual. A question mark indicates that a decision could not be made, or was made with uncertainty.

RNA | RNAbor (auto/man) | paRNAss (auto/man)

Positive examples from paRNAss website
47 nt. switch (AE / ) + / + / -
α operon mRNA (E. coli) / /
3'-UTR of AMV RNA / /
Attenuator + / + b + / + a,b
5'-UTR of btuB mRNA (E. coli) + / + a + / +
dsrA (E. coli) + / + + / +
HDV ribozyme + / + + / +
HIV-1 leader + / + / +
hok (E. coli) + / + ± / +
5'-UTR of MS2 RNA (E. coli) / + /
ribD leader (B. subtilis) / /
S15 mRNA (E. coli) / + /
S-box leader metE (B. subtilis) / + /
Spliced leader RNA (L. collosoma) b / ± + / +
T4 td gene intron + b / + + / +
Tetrahymena group I intron + / + + / +
thiM-leader RNA (E. coli) b / + / +
ypaA leader (B. subtilis) c / /

Negative examples
Hammerhead type I (L07513) / (?)
Hammerhead type I (Z69690) b /?
miRNA, let-7 (AP001359) /
miRNA, mir-1 (AE003667) /
5S rRNA (M16530) /
tRNA (AE006699) c / ± (?)
tRNA (X06054) + / + (few structures)
U5 (X13427) /
U5 (X15935) + / (partial?)

a A switch under the assumption that the input (MFE) structure is one of the alternative structures.
b More than two peaks.
c Two peaks, but the second is much smaller than the MFE peak.

References

M.D. Adams et al. The genome sequence of Drosophila melanogaster. Science, 287(5461).
S.F. Altschul and B.W. Erickson. Significance of nucleotide sequence alignments: A method for random sequence permutation that preserves dinucleotide and codon usage. Mol. Biol. Evol., 2(6).
A.R. Banerjee, J.A. Jaeger, and D.H. Turner. Thermal unfolding of a group I ribozyme: The low-temperature transition is primarily disruption of tertiary structure. Biochemistry, 32.
E. Bonnet, J. Wuyts, P. Rouze, and Y. Van de Peer. Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences. Bioinformatics, 20(17).
C. Brown, B. Hendrich, J. Rupert, R. Lafreniere, Y. Xing, J. Lawrence, and H. Willard. The human XIST gene: Analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell, 71.
J.J. Cannone, S. Subramanian, M.N. Schnare, J.R. Collett, L.M. D'Souza, Y. Du, B. Feng, N. Lin, L.V. Madabusi, K.M. Muller, N. Pande, Z. Shang, N. Yu, and R.R. Gutell. The comparative RNA web (CRW) site: An online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BioMed Central Bioinformatics, 3(2). Correction: BioMed Central Bioinformatics, 3(15).
P. Clote, F. Ferre, E. Kranakis, and D. Krizanc. Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency. RNA, 11.
P. Clote, F. Ferré, E. Kranakis, and D. Krizanc. Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency. RNA, 11(5).
S. Commans and A. Böck. Selenocysteine inserting tRNAs: an overview. FEMS Microbiology Reviews, 23.
Y. Ding, C.Y. Chan, and C.E. Lawrence. RNA secondary structure by centroids in a Boltzmann weighted ensemble. RNA, 11(8).
Y. Ding and C.E. Lawrence. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res., 31(24).
J.A. Doudna and T.R. Cech. The chemical repertoire of natural ribozymes. Nature, 418(6894).
C. Flamm, I.L. Hofacker, S. Maurer-Stroh, P.F. Stadler, and M. Zehl. Design of multi-stable RNA molecules. RNA, 7.
Thomas Franch, Alexander P. Gultyaev, and Kenn Gerdes. Programmed cell death by hok/sok of plasmid R1: Processing at the hok mRNA 3'-end triggers structural rearrangements that allow translation and antisense RNA binding. J. Mol. Biol., 273:38-51.
Eva Freyhult, Paul P. Gardner, and Vincent Moulton. A comparison of RNA folding measures. BMC Bioinformatics, 6(1):241.
R. Giegerich, D. Haase, and M. Rehmsmeier. Prediction and visualization of structural switches in RNA. Pac. Symp. Biocomput.
R. Giegerich, B. Voss, and M. Rehmsmeier. Abstract shapes of RNA. Nucleic Acids Res., 32(16).
S. Griffiths-Jones, A. Bateman, M. Marshall, A. Khanna, and S.R. Eddy. Rfam: an RNA family database. Nucleic Acids Res., 31(1).
J. Harborth, S.M. Elbashir, K. Vandenburgh, H. Manninga, S.A. Scaringe, K. Weber, and T. Tuschl. Sequence, chemical, and structural variation of small interfering RNAs and short hairpin RNAs and the effect on mammalian gene silencing. Antisense Nucleic Acid Drug Dev., 13:83-106.
A. Ke, K. Zhou, F. Ding, J.H. Cate, and J.A. Doudna. A conformational switch controls hepatitis delta virus ribozyme catalysis. Nature, 429.
L.P. Lim, M.E. Glasner, S. Yekta, C.B. Burge, and D.P. Bartel. Vertebrate microRNA genes. Science, 299(5612):1540.
D.H. Mathews and D.H. Turner. Experimentally derived nearest-neighbor parameters for the stability of RNA three- and four-way multibranch loops. Biochemistry, 41.
D.H. Mathews, J. Sabina, M. Zuker, and H. Turner. Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure. J. Mol. Biol., 288.
D.H. Matthews, J. Sabina, M. Zuker, and D.H. Turner. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288.
J.S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structures. Biopolymers, 29.
S. Moon, Y. Byun, H.-J. Kim, S. Jeong, and K. Han. Predicting genes expressed via -1 and +1 frameshifts. Nucleic Acids Res., 32(16).
V. Moulton, M. Zuker, M. Steel, R. Pointon, and D. Penny. Metrics on RNA secondary structures. J. Comput. Biol., 7.
R. Nussinov and A.B. Jacobson. Fast algorithm for predicting the secondary structure of single-stranded RNA. Proc. Natl. Acad. Sci. U.S.A., 77.
R. Penchovsky and R.R. Breaker. Computational design and experimental validation of oligonucleotide-sensing allosteric ribozymes. Nature Biotechnology, 23(11).
E. Rivas and S.R. Eddy. Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs. Bioinformatics, 16.
P.J. Schlax, K.A. Xavier, T.C. Gluick, and D.E. Draper. Translational repression of the Escherichia coli alpha operon mRNA: importance of an mRNA conformational switch and a ternary entrapment complex. J. Biol. Chem., 276.
P. Steffen, B. Voss, M. Rehmsmeier, J. Reeder, and R. Giegerich. RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics, 22(4).
T. Tuschl. Functional genomics: RNA sets the standard. Nature, 421.
A.V. Uzilov, J.M. Keegan, and D.H. Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics, 7:173.
Q. Vicens and T.R. Cech. Atomic level architecture of group I introns revealed. Trends Biochem. Sci., 31(1):41-51.
B. Voss, R. Giegerich, and M. Rehmsmeier. Complete probabilistic analysis of RNA shapes. BMC Biol., 4(5).
B. Voss, C. Meyer, and R. Giegerich. Evaluating the predictability of conformational switching in RNA. Bioinformatics.
P. Walter and G. Blobel. Signal recognition particle contains a 7S RNA essential for protein translocation across the endoplasmic reticulum. Nature, 299(5885).
S. Washietl, I.L. Hofacker, and P.F. Stadler. Fast and reliable prediction of noncoding RNAs. Proc. Natl. Acad. Sci. USA, 19.
J.S. Weinger, K.M. Parnell, S. Dorner, R. Green, and S.A. Strobel. Substrate-assisted catalysis of peptide bond formation by the ribosome. Nature Structural & Molecular Biology, 11.
W.C. Winkler, S. Cohen-Chalamish, and R.R. Breaker. An mRNA structure that controls gene expression by binding FMN. Proc. Natl. Acad. Sci. U.S.A., 99.
C. Workman and A. Krogh. No evidence that mRNAs have lower folding free energies than random sequences with the same dinucleotide distribution. Nucl. Acids Res., 27.
S. Wuchty, W. Fontana, I.L. Hofacker, and P. Schuster. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers, 49.
T. Xia, J. SantaLucia Jr., M.E. Burkard, R. Kierzek, S.J. Schroeder, X. Jiao, C. Cox, and D.H. Turner. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry, 37.
M. Zuker. Prediction of RNA secondary structure by energy minimization. Methods Mol. Biol., 25.
Michael Zuker and David Sankoff. RNA secondary structures and their prediction. Bulletin of Mathematical Biology, 46.

A RNAbor and paRNAss plots

The figures below show the RNAbor plot side by side with the paRNAss plot showing energy barrier versus morphological distance. All of the paRNAss figures are produced using the paRNAss webserver with the default settings (an energy threshold for the suboptimal structures of 2 kcal/mol and a maximum number of structures of 50).

A.1 Positive examples

Figure 10: 47 nucleotide example switch
Figure 11: E. coli α operon mRNA
Figure 12: 3'-UTR of AMV RNA
Figure 13: Attenuator
Figure 14: 5'-UTR of E. coli btuB mRNA
Figure 15: E. coli dsrA
Figure 16: HDV ribozyme
Figure 17: HIV-1 leader
Figure 18: E. coli hok
Figure 19: 5'-UTR of MS2 RNA from E. coli
Figure 20: B. subtilis ribD leader
Figure 21: E. coli S15 mRNA
Figure 22: S-box leader of B. subtilis metE
Figure 23: Spliced Leader RNA from L. collosoma
Figure 24: T4 td gene intron


Conserved RNA Structures. Ivo L. Hofacker. Institut for Theoretical Chemistry, University Vienna. onserved RN Structures Ivo L. Hofacker Institut for Theoretical hemistry, University Vienna http://www.tbi.univie.ac.at/~ivo/ Bled, January 2002 Energy Directed Folding Predict structures from sequence

More information

proteins are the basic building blocks and active players in the cell, and

proteins are the basic building blocks and active players in the cell, and 12 RN Secondary Structure Sources for this lecture: R. Durbin, S. Eddy,. Krogh und. Mitchison, Biological sequence analysis, ambridge, 1998 J. Setubal & J. Meidanis, Introduction to computational molecular

More information

Lesson Overview. Gene Regulation and Expression. Lesson Overview Gene Regulation and Expression

Lesson Overview. Gene Regulation and Expression. Lesson Overview Gene Regulation and Expression 13.4 Gene Regulation and Expression THINK ABOUT IT Think of a library filled with how-to books. Would you ever need to use all of those books at the same time? Of course not. Now picture a tiny bacterium

More information

UNIT 5. Protein Synthesis 11/22/16

UNIT 5. Protein Synthesis 11/22/16 UNIT 5 Protein Synthesis IV. Transcription (8.4) A. RNA carries DNA s instruction 1. Francis Crick defined the central dogma of molecular biology a. Replication copies DNA b. Transcription converts DNA

More information

Lightweight Comparison of RNAs Based on Exact Sequence-Structure Matches

Lightweight Comparison of RNAs Based on Exact Sequence-Structure Matches Lightweight Comparison of RNAs Based on Exact Sequence-Structure Matches Steffen Heyne, Sebastian Will, Michael Beckstette, Rolf Backofen {heyne,will,mbeckste, backofen}@informatik.uni-freiburg.de Albert-Ludwigs-University

More information

RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory

RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory Garcia-Martin et al. BMC Bioinformatics 2016 17:424 DOI 10.1186/s12859-016-1280-6 SOFTWARE RNAdualPF: software to compute the dual partition function with sample applications in molecular evolution theory

More information

Finding Consensus Energy Folding Landscapes Between RNA Sequences

Finding Consensus Energy Folding Landscapes Between RNA Sequences University of Central Florida Electronic Theses and Dissertations Masters Thesis (Open Access) Finding Consensus Energy Folding Landscapes Between RNA Sequences 2015 Joshua Burbridge University of Central

More information

Computational Cell Biology Lecture 4

Computational Cell Biology Lecture 4 Computational Cell Biology Lecture 4 Case Study: Basic Modeling in Gene Expression Yang Cao Department of Computer Science DNA Structure and Base Pair Gene Expression Gene is just a small part of DNA.

More information

Introduction to Evolutionary Concepts

Introduction to Evolutionary Concepts Introduction to Evolutionary Concepts and VMD/MultiSeq - Part I Zaida (Zan) Luthey-Schulten Dept. Chemistry, Beckman Institute, Biophysics, Institute of Genomics Biology, & Physics NIH Workshop 2009 VMD/MultiSeq

More information

Genetic transcription and regulation

Genetic transcription and regulation Genetic transcription and regulation Central dogma of biology DNA codes for DNA DNA codes for RNA RNA codes for proteins not surprisingly, many points for regulation of the process DNA codes for DNA replication

More information

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p.110-114 Arrangement of information in DNA----- requirements for RNA Common arrangement of protein-coding genes in prokaryotes=

More information

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed

More information

RecitaLon CB Lecture #10 RNA Secondary Structure

RecitaLon CB Lecture #10 RNA Secondary Structure RecitaLon 3-19 CB Lecture #10 RNA Secondary Structure 1 Announcements 2 Exam 1 grades and answer key will be posted Friday a=ernoon We will try to make exams available for pickup Friday a=ernoon (probably

More information

Analytical Study of Hexapod mirnas using Phylogenetic Methods

Analytical Study of Hexapod mirnas using Phylogenetic Methods Analytical Study of Hexapod mirnas using Phylogenetic Methods A.K. Mishra and H.Chandrasekharan Unit of Simulation & Informatics, Indian Agricultural Research Institute, New Delhi, India akmishra@iari.res.in,

More information

Computational Biology and Chemistry

Computational Biology and Chemistry Computational Biology and Chemistry 33 (2009) 245 252 Contents lists available at ScienceDirect Computational Biology and Chemistry journal homepage: www.elsevier.com/locate/compbiolchem Research Article

More information

+ regulation. ribosomes

+ regulation. ribosomes central dogma + regulation rpl DNA tsx rrna trna mrna ribosomes tsl ribosomal proteins structural proteins transporters enzymes srna regulators RNAp DNAp tsx initiation control by transcription factors

More information

Chapter 15 Active Reading Guide Regulation of Gene Expression

Chapter 15 Active Reading Guide Regulation of Gene Expression Name: AP Biology Mr. Croft Chapter 15 Active Reading Guide Regulation of Gene Expression The overview for Chapter 15 introduces the idea that while all cells of an organism have all genes in the genome,

More information

Regulation of Transcription in Eukaryotes

Regulation of Transcription in Eukaryotes Regulation of Transcription in Eukaryotes Leucine zipper and helix-loop-helix proteins contain DNA-binding domains formed by dimerization of two polypeptide chains. Different members of each family can

More information

STRUCTURAL BIOINFORMATICS I. Fall 2015

STRUCTURAL BIOINFORMATICS I. Fall 2015 STRUCTURAL BIOINFORMATICS I Fall 2015 Info Course Number - Classification: Biology 5411 Class Schedule: Monday 5:30-7:50 PM, SERC Room 456 (4 th floor) Instructors: Vincenzo Carnevale - SERC, Room 704C;

More information

RNA Structure Prediction and Comparison. RNA folding

RNA Structure Prediction and Comparison. RNA folding RNA Structure Prediction and Comparison Session 3 RNA folding Faculty of Technology robert@techfak.uni-bielefeld.de Bielefeld, WS 2013/2014 Base Pair Maximization This was the first structure prediction

More information

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its

More information

Boolean models of gene regulatory networks. Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016

Boolean models of gene regulatory networks. Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016 Boolean models of gene regulatory networks Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016 Gene expression Gene expression is a process that takes gene info and creates

More information

Rapid Dynamic Programming Algorithms for RNA Secondary Structure

Rapid Dynamic Programming Algorithms for RNA Secondary Structure ADVANCES IN APPLIED MATHEMATICS 7,455-464 I f Rapid Dynamic Programming Algorithms for RNA Secondary Structure MICHAEL S. WATERMAN* Depurtments of Muthemutics und of Biologicul Sciences, Universitk of

More information

The wonderful world of RNA informatics

The wonderful world of RNA informatics December 9, 2012 Course Goals Familiarize you with the challenges involved in RNA informatics. Introduce commonly used tools, and provide an intuition for how they work. Give you the background and confidence

More information

O 3 O 4 O 5. q 3. q 4. Transition

O 3 O 4 O 5. q 3. q 4. Transition Hidden Markov Models Hidden Markov models (HMM) were developed in the early part of the 1970 s and at that time mostly applied in the area of computerized speech recognition. They are first described in

More information

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University

COMP 598 Advanced Computational Biology Methods & Research. Introduction. Jérôme Waldispühl School of Computer Science McGill University COMP 598 Advanced Computational Biology Methods & Research Introduction Jérôme Waldispühl School of Computer Science McGill University General informations (1) Office hours: by appointment Office: TR3018

More information

A Method for Aligning RNA Secondary Structures

A Method for Aligning RNA Secondary Structures Method for ligning RN Secondary Structures Jason T. L. Wang New Jersey Institute of Technology J Liu, JTL Wang, J Hu and B Tian, BM Bioinformatics, 2005 1 Outline Introduction Structural alignment of RN

More information

Flow of Genetic Information

Flow of Genetic Information presents Flow of Genetic Information A Montagud E Navarro P Fernández de Córdoba JF Urchueguía Elements Nucleic acid DNA RNA building block structure & organization genome building block types Amino acid

More information

CS681: Advanced Topics in Computational Biology

CS681: Advanced Topics in Computational Biology CS681: Advanced Topics in Computational Biology Can Alkan EA224 calkan@cs.bilkent.edu.tr Week 10 Lecture 1 http://www.cs.bilkent.edu.tr/~calkan/teaching/cs681/ RNA folding Prediction of secondary structure

More information

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES ÉRIC FUSY AND PETER CLOTE Abstract. It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is 1.104366

More information

CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models

CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models Supplementary Material for CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models Chuong B Do, Daniel A Woods, and Serafim Batzoglou Stanford University, Stanford, CA 94305, USA, {chuongdo,danwoods,serafim}@csstanfordedu,

More information

A rule of seven in Watson-Crick base-pairing of mismatched sequences

A rule of seven in Watson-Crick base-pairing of mismatched sequences A rule of seven in Watson-Crick base-pairing of mismatched sequences Ibrahim I. Cisse 1,3, Hajin Kim 1,2, Taekjip Ha 1,2 1 Department of Physics and Center for the Physics of Living Cells, University of

More information

Computational Methods For Analyzing Rna Folding Landscapes And Its Applications

Computational Methods For Analyzing Rna Folding Landscapes And Its Applications University of Central Florida Electronic Theses and Dissertations Doctoral Dissertation (Open Access) Computational Methods For Analyzing Rna Folding Landscapes And Its Applications 2012 Yuan Li University

More information

Markov Models & DNA Sequence Evolution

Markov Models & DNA Sequence Evolution 7.91 / 7.36 / BE.490 Lecture #5 Mar. 9, 2004 Markov Models & DNA Sequence Evolution Chris Burge Review of Markov & HMM Models for DNA Markov Models for splice sites Hidden Markov Models - looking under

More information

13.4 Gene Regulation and Expression

13.4 Gene Regulation and Expression 13.4 Gene Regulation and Expression Lesson Objectives Describe gene regulation in prokaryotes. Explain how most eukaryotic genes are regulated. Relate gene regulation to development in multicellular organisms.

More information

Structure-Based Comparison of Biomolecules

Structure-Based Comparison of Biomolecules Structure-Based Comparison of Biomolecules Benedikt Christoph Wolters Seminar Bioinformatics Algorithms RWTH AACHEN 07/17/2015 Outline 1 Introduction and Motivation Protein Structure Hierarchy Protein

More information

Clustering of RNA Secondary Structures with Application to Messenger RNAs

Clustering of RNA Secondary Structures with Application to Messenger RNAs doi:10.1016/j.jmb.2006.01.056 J. Mol. Biol. (2006) 359, 554 571 Clustering of RNA Secondary Structures with Application to Messenger RNAs Ye Ding 1 *, Chi Yu Chan 1 and Charles E. Lawrence 1,2 * 1 Wadsworth

More information

Lecture 18 June 2 nd, Gene Expression Regulation Mutations

Lecture 18 June 2 nd, Gene Expression Regulation Mutations Lecture 18 June 2 nd, 2016 Gene Expression Regulation Mutations From Gene to Protein Central Dogma Replication DNA RNA PROTEIN Transcription Translation RNA Viruses: genome is RNA Reverse Transcriptase

More information

ANALYZING MODULAR RNA STRUCTURE REVEALS LOW GLOBAL STRUCTURAL ENTROPY IN MICRORNA SEQUENCE

ANALYZING MODULAR RNA STRUCTURE REVEALS LOW GLOBAL STRUCTURAL ENTROPY IN MICRORNA SEQUENCE AALYZIG MODULAR RA STRUCTURE REVEALS LOW GLOBAL STRUCTURAL ETROPY I MICRORA SEQUECE Timothy I. Shaw * and Amir Manzour Institute of Bioinformatics, University of Georgia Athens,Ga 30605, USA Email: gatech@uga.edu,

More information

A Novel Statistical Model for the Secondary Structure of RNA

A Novel Statistical Model for the Secondary Structure of RNA ISBN 978-1-8466-93-3 Proceedings of the 5th International ongress on Mathematical Biology (IMB11) Vol. 3 Nanjing, P. R. hina, June 3-5, 11 Novel Statistical Model for the Secondary Structure of RN Liu

More information

Identification and annotation of promoter regions in microbial genome sequences on the basis of DNA stability

Identification and annotation of promoter regions in microbial genome sequences on the basis of DNA stability Annotation of promoter regions in microbial genomes 851 Identification and annotation of promoter regions in microbial genome sequences on the basis of DNA stability VETRISELVI RANGANNAN and MANJU BANSAL*

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information

GENE ACTIVITY Gene structure Transcription Transcript processing mrna transport mrna stability Translation Posttranslational modifications

GENE ACTIVITY Gene structure Transcription Transcript processing mrna transport mrna stability Translation Posttranslational modifications 1 GENE ACTIVITY Gene structure Transcription Transcript processing mrna transport mrna stability Translation Posttranslational modifications 2 DNA Promoter Gene A Gene B Termination Signal Transcription

More information

BME 5742 Biosystems Modeling and Control

BME 5742 Biosystems Modeling and Control BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various

More information

RNA and Protein Structure Prediction

RNA and Protein Structure Prediction RNA and Protein Structure Prediction Bioinformatics: Issues and Algorithms CSE 308-408 Spring 2007 Lecture 18-1- Outline Multi-Dimensional Nature of Life RNA Secondary Structure Prediction Protein Structure

More information

Lecture 8: RNA folding

Lecture 8: RNA folding 1/16, Lecture 8: RNA folding Hamidreza Chitsaz Colorado State University chitsaz@cs.colostate.edu Spring 2018 February 15, 2018 2/16, Nearest Neighbor Model 3/16, McCaskill s Algorithm for MFE Structure

More information

4. Why not make all enzymes all the time (even if not needed)? Enzyme synthesis uses a lot of energy.

4. Why not make all enzymes all the time (even if not needed)? Enzyme synthesis uses a lot of energy. 1 C2005/F2401 '10-- Lecture 15 -- Last Edited: 11/02/10 01:58 PM Copyright 2010 Deborah Mowshowitz and Lawrence Chasin Department of Biological Sciences Columbia University New York, NY. Handouts: 15A

More information

Quantitative modeling of RNA single-molecule experiments. Ralf Bundschuh Department of Physics, Ohio State University

Quantitative modeling of RNA single-molecule experiments. Ralf Bundschuh Department of Physics, Ohio State University Quantitative modeling of RN single-molecule experiments Ralf Bundschuh Department of Physics, Ohio State niversity ollaborators: lrich erland, LM München Terence Hwa, San Diego Outline: Single-molecule

More information