Extending the hypergraph analogy for RNA dynamic programming


1 Extending the hypergraph analogy for RNA dynamic programming
Yann Ponty, Balaji Raman, Cédric Saule
Polytechnique/CNRS/INRIA AMIB, France

2 RNA folding
RNA = biopolymer composed of nucleotides A, C, G and U (A: adenosine, C: cytosine, G: guanine, U: uracil).
Canonical base pairs: A/U and C/G (Watson/Crick), G/U (wobble).
RNA folding = stochastic continuous process directed by (and resulting in) the pairing of nucleotides.
Understanding RNA folding is a key step toward understanding and predicting its function(s).

3 RNA structure(s)
Three levels of representation: primary structure, secondary (+) structure, tertiary structure.
Source: 5S rRNA (PDB 1K73:B).
Well, almost...

5 Discarded by secondary structure
Non-canonical base pairs: any base pair other than {(A-U), (C-G), (G-U)}, or any pair interacting through a non-standard edge/orientation, i.e. anything other than W/W-cis [LW01].
[Figure: a canonical pair (W/W-cis) next to a non-canonical pair (Sugar/W-trans).]
Pseudoknots.
[Figure: pseudoknots within a group I ribozyme (PDB ID: 1Y0Q:A).]
Allowing them gives a more expressive model, but ab initio folding with pseudoknots is NP-complete [LP00]... yet polynomial for restricted classes [CDR+04].

6 Our objective(s)
Authors                      Complexity
Lyngsø and Pedersen          O(n^5)
Reeder and Giegerich         O(n^4)
Akutsu and Uemura            O(n^5)
Chen, Jabbari and Condon     O(n^5)
Cao and Chen                 O(n^6)
Dirks and Pierce             O(n^5)
Rivas and Eddy               O(n^6)
Many DP algorithms have been proposed for folding with restricted pseudoknots.
Their decompositions induce high complexities and/or are ambiguous, which prohibits ensemble-based approaches (partition function, base-pair probabilities, ...).
Is it possible to come up with highly expressive, unambiguous decompositions?
Our programme:
To unify DP RNA algorithms into a single abstract framework.
To reformulate ensemble-based applications within this framework.
To improve existing algorithms by working on their decomposition at an abstract level.
This may benefit RNA-RNA interaction prediction, parameterized approaches...

7 DP example: Nussinov/Jacobson decomposition
Nearest-neighbor model: free energy = weighted sum over base pairs.
Goal: given an RNA sequence ω, find the minimum free energy (MFE) secondary structure compatible with base pairing.
Decomposition of an interval [i, j]: either position i is unpaired, or i is paired with some position k at distance greater than θ.
$$N_{i,t} = 0, \quad \forall t \in [i, i+\theta]$$
$$N_{i,j} = \min \begin{cases} N_{i+1,j} & (i \text{ unpaired}) \\ \min_{k=i+\theta+1}^{j} \; E_{\omega_i,\omega_k} + N_{i+1,k-1} + N_{k+1,j} & (i \text{ paired with } k) \end{cases}$$
with $E_{G,C} = E_{C,G} = -3.0$ kcal/mol, $E_{A,U} = E_{U,A} = -2.0$ kcal/mol, and $E_{G,U} = E_{U,G} = -1.0$ kcal/mol.
Once the minimal energy is known, the corresponding secondary structure is obtained through a backtrack procedure.
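The recurrence and the backtrack can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `nussinov_mfe`, the default `theta=3` and the 0-based indexing are assumptions made for the example, and the recursive formulation is only suitable for short sequences.

```python
from functools import lru_cache

# Pair energies in kcal/mol, as listed on the slide.
ENERGY = {("G", "C"): -3.0, ("C", "G"): -3.0,
          ("A", "U"): -2.0, ("U", "A"): -2.0,
          ("G", "U"): -1.0, ("U", "G"): -1.0}

def nussinov_mfe(seq, theta=3):
    """Return (minimum free energy, dot-bracket structure) for seq."""
    n = len(seq)

    @lru_cache(maxsize=None)
    def N(i, j):
        # Intervals too short to host a base pair have energy 0.
        if j - i <= theta:
            return 0.0
        best = N(i + 1, j)                      # case: i unpaired
        for k in range(i + theta + 1, j + 1):   # case: i paired with k
            e = ENERGY.get((seq[i], seq[k]))
            if e is not None:
                best = min(best, e + N(i + 1, k - 1) + N(k + 1, j))
        return best

    def backtrack(i, j, out):
        # Re-derive which case produced the optimum and recurse on it.
        if j - i <= theta:
            return
        if N(i, j) == N(i + 1, j):
            backtrack(i + 1, j, out)
            return
        for k in range(i + theta + 1, j + 1):
            e = ENERGY.get((seq[i], seq[k]))
            if e is not None and N(i, j) == e + N(i + 1, k - 1) + N(k + 1, j):
                out[i], out[k] = "(", ")"
                backtrack(i + 1, k - 1, out)
                backtrack(k + 1, j, out)
                return

    out = ["."] * n
    backtrack(0, n - 1, out)
    return N(0, n - 1), "".join(out)

print(nussinov_mfe("GGGAAAUCC"))  # (-7.0, '(((...)))')
```

Note that the equation and the application (minimization) are entangled here: counting structures or computing a partition function would require rewriting the body of `N`.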

8 More complex/realistic example
Turner model: free energy = weighted sum over loops.
$$M'_{i,j} = \min \begin{cases} E_H(i,j) \\ E_S(i,j) + M'_{i+1,j-1} \\ \min_{i',j'} \big( E_{BI}(i,i',j',j) + M'_{i',j'} \big) \\ a + c + \min_k \big( M_{i+1,k-1} + M^1_{k,j-1} \big) \end{cases}$$
$$M_{i,j} = \min_k \Big\{ \min\big( M_{i,k-1},\; b\,(k-i) \big) + M^1_{k,j} \Big\}$$
$$M^1_{i,j} = \min \Big\{ b + M^1_{i,j-1},\; c + M'_{i,j} \Big\}$$
$E_H(i,j)$: energy of the terminal loop closed by base pair (i, j).
$E_{BI}(i,i',j',j)$: energy of the bulge/internal loop closed by base pair (i, j).
$E_S(i,j)$: stacking energy of (i, j)/(i+1, j-1).
a, c, b: penalties for a multiloop, a helix and an unpaired base in a multiloop, respectively.

9 Beyond minimization
Additional motivation: go beyond energy minimization.
Current driving hypothesis = Boltzmann ensemble of low energy.
Functional folding? Ill-defined folding: mRNA? Bistable RNA. Kinetics?
Decompositions will have to be unambiguous...

10 Dynamic programming
Starting from an instance, or problem:
Search space: may depend on the instance.
Objective function: defined on the search space, may depend on the instance.
Dynamic programming equation: relates the value of the objective function for a problem to that of its sub-problems (substructure property), relying on a decomposition of the search space.
Compute the objective function for the sub-problems (use a backtrack to build the solution).
Nussinov example:
Search space: secondary structures inducing valid base pairs.
Objective function: free energy.
DP equation/decomposition: MFE obtained by minimizing the energy on subsequence(s).
Alternative view: the DP equation generates the search space AND implements a specific application. These two conceptually different entities should be detached.

11 Existing DP abstract frameworks
Parsing approaches: the search space is modeled by context-free grammar(s) (CFG).
(Multitape) attribute grammars (Lefebvre, Waldispühl et al.): compute and minimize a score based on an attribute algebra.
Pros: captures simple pseudoknots through synchronized multiple tapes.
Cons: designed for optimization. Multiple tapes can be overkill for structures that are almost context-free.
Algebraic Dynamic Programming (Giegerich et al.)
Pros: addresses general applications thanks to an algebraic approach.
Cons: pseudoknots require hacking the formalism.
Other: hypergraphs (..., Roytberg/Finkelstein)
Pros: very flexible. Mild influence of order. Any CFG can be transformed into an equivalent hypergraph.
Cons: no generic implementation (yet!).


14 Hypergraphs
Hypergraphs generalize directed graphs to arcs of arbitrary in/out degrees.
Definition (Hypergraph): a directed hypergraph H is a couple (V, E) such that:
V is a set of vertices;
E is a set of hyperarcs e = (t(e) → h(e)) such that t(e), h(e) ⊆ V.
Forward hypergraphs, or F-graphs, are hypergraphs whose arcs have ingoing degree exactly 1.


16 F-paths
[Figure: an F-graph and all the F-paths starting from one of its vertices, each annotated with its weight and score.]
Definition (F-path): an F-path is a tree having root s ∈ V, whose children are F-paths built from the outgoing vertices of some arc e = (s → t) ∈ E.
Remark: arcs of out-degree 0 (t = ∅) provide a terminal case to the above recursive definition.
An F-graph is independent iff any arc is present at most once in each F-path.
A numerical value function π : E → R can be assigned to each arc e ∈ E:
The weight of a path is the product of its arcs' values.
The score of a path is the sum of its arcs' values.
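To make the recursive definition concrete, here is a small Python sketch that enumerates the F-paths of a toy F-graph and computes their weights and scores. The dictionary encoding (vertex mapped to a list of `(value, head)` arcs), the graph `FGRAPH` itself and all function names are hypothetical choices made for this illustration.

```python
from itertools import product
from math import prod

# Hypothetical toy F-graph: vertex -> list of arcs (value, head vertices).
# An arc with an empty head is a terminal case.
FGRAPH = {
    "s":  [(2.0, ("t1", "t2")), (3.0, ())],
    "t1": [(5.0, ())],
    "t2": [(1.0, ()), (4.0, ("t3",))],
    "t3": [(0.5, ())],
}

def f_paths(graph, s):
    """Yield every F-path rooted at s as a nested tuple (value, subpaths)."""
    for value, head in graph[s]:
        # One sub-path per head vertex; every combination is an F-path.
        for children in product(*(list(f_paths(graph, t)) for t in head)):
            yield (value, children)

def arc_values(path):
    """Flatten an F-path into the list of its arcs' values."""
    value, children = path
    return [value] + [v for c in children for v in arc_values(c)]

def weight(path):
    return prod(arc_values(path))   # product of the arcs' values

def score(path):
    return sum(arc_values(path))    # sum of the arcs' values

for p in f_paths(FGRAPH, "s"):
    print(weight(p), score(p))
```

Exhaustive enumeration is exponential in general; it only serves here as a reference against which dynamic programming results can be checked.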


20 Basic algorithms
H = (s_0, V, E, π): acyclic F-graph, where s_0 is the initial node and π the value function. Some questions naturally arise.
What is the (min/max)imal score m_s of an F-path starting from s ∈ V?
$$m_s = \min_{e=(s \to t) \in E} \Big( \pi(e) + \sum_{u \in t} m_u \Big)$$
What is the number n_s of F-paths starting from s ∈ V?
$$n_s = \sum_{(s \to t) \in E} \; \prod_{u \in t} n_u$$
What is the total weight w_s of all F-paths starting from s ∈ V?
$$w_s = \sum_{e=(s \to h(e)) \in E} \pi(e) \prod_{s' \in h(e)} w_{s'}$$
Complexities, for each of the three questions: Θ(|E| + |V|) time / Θ(|V|) memory.
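The three recurrences share the same traversal and differ only in the operations applied to arcs and heads. A minimal memoized sketch follows, assuming a hypothetical dictionary encoding of the F-graph (vertex mapped to a list of `(value, head)` arcs) and a made-up toy graph:

```python
from functools import lru_cache
from math import prod

# Hypothetical toy F-graph: vertex -> list of arcs (value, head vertices).
FGRAPH = {
    "s":  [(2.0, ("t1", "t2")), (3.0, ())],
    "t1": [(5.0, ())],
    "t2": [(1.0, ()), (4.0, ("t3",))],
    "t3": [(0.5, ())],
}

def solve(graph, s0):
    """Return (minimal score, number of F-paths, total weight) from s0."""

    @lru_cache(maxsize=None)
    def m(s):   # minimal score: min over arcs, sum over head vertices
        return min(v + sum(m(u) for u in head) for v, head in graph[s])

    @lru_cache(maxsize=None)
    def n(s):   # count: sum over arcs, product over head vertices
        return sum(prod(n(u) for u in head) for _, head in graph[s])

    @lru_cache(maxsize=None)
    def w(s):   # total weight: same shape as the count, weighted by the value
        return sum(v * prod(w(u) for u in head) for v, head in graph[s])

    return m(s0), n(s0), w(s0)

print(solve(FGRAPH, "s"))  # (3.0, 3, 33.0)
```

Memoization ensures each vertex is evaluated once, which is what the stated time and memory bounds rely on; it terminates because the F-graph is assumed acyclic.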

23 Ensemble applications: random generation
Definition (Weighted distribution): assume a weighted, Boltzmann-like, distribution on the set T of F-paths:
$$P(p \mid \pi) = \frac{\prod_{e \in p} \pi(e)}{w_{s_0}}, \quad \forall p \in T.$$
Ensemble-related questions arise:
How to generate a random F-path p from T with respect to the weighted distribution? Complexities: Θ(|E| + |V|) time / Θ(|V|) memory for the precomputation, plus O(|p| + Σ_{e∈p} |h(e)|) time per generation.
What is the probability that a given arc is present in a random F-path? Inside/outside algorithm.
How are additive features distributed?
Random generation algorithm: from vertex s, choose the arc e_i = (s → t_i) with probability
$$p_{s,i} = \frac{\pi(e_i) \prod_{s' \in t_i} w_{s'}}{w_s},$$
then iterate the process over all the vertices of t_i.
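The generation procedure can be sketched as follows in Python. The encoding of the F-graph, the toy graph and the name `make_sampler` are assumptions for the sake of the example; arc values are assumed non-negative so that they can serve as sampling weights.

```python
import random
from functools import lru_cache
from math import prod

# Hypothetical toy F-graph: vertex -> list of arcs (value, head vertices).
FGRAPH = {
    "s":  [(2.0, ("t1", "t2")), (3.0, ())],
    "t1": [(5.0, ())],
    "t2": [(1.0, ()), (4.0, ("t3",))],
    "t3": [(0.5, ())],
}

def make_sampler(graph, seed=None):
    """Return a function drawing F-paths from the weighted distribution."""
    rng = random.Random(seed)

    @lru_cache(maxsize=None)
    def w(s):   # precomputation: total weight of the F-paths rooted at s
        return sum(v * prod(w(u) for u in head) for v, head in graph[s])

    def sample(s):
        arcs = graph[s]
        # Unnormalized p_{s,i}: arc value times the weights of its head.
        probs = [v * prod(w(u) for u in head) for v, head in arcs]
        value, head = rng.choices(arcs, weights=probs)[0]
        # Iterate the process independently over every head vertex.
        return (value, tuple(sample(t) for t in head))

    return sample

sample = make_sampler(FGRAPH, seed=0)
print(sample("s"))
```

On this toy graph the three F-paths rooted at `s` have weights 10, 20 and 3, so they should be drawn with probabilities 10/33, 20/33 and 3/33.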


31 Inside/outside algorithm
Any F-path using a given arc e = (s → t) decomposes into:
Inside: some F-path built from each vertex of t.
The distinguished arc e.
Outside: the context where e is used, i.e. a path p = (s_0, s_1, ..., s) using F-arcs (e_1, ..., e_k), where each vertex in some h(e_i) \ p is completed into an F-path (siblings of the vertices in p).
If the F-graph is acyclic and independent, this decomposition is complete and unambiguous, and induces the following DP equations for the cumulated probability p_e of all F-paths featuring e = (s → t):
$$p_e = \frac{b_s \cdot \pi(e) \cdot \prod_{s' \in t} w_{s'}}{w_{s_0}}$$
$$b_s = \mathbb{1}_{s=s_0} + \sum_{\substack{e'=(s' \to t') \in E \\ \text{s.t. } s \in t'}} \pi(e') \cdot b_{s'} \cdot \prod_{\substack{s'' \in t' \\ s'' \neq s}} w_{s''}, \quad \forall s \in V$$
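A compact Python sketch of these two equations is given below. The graph encoding, the toy graph and the function name `arc_probabilities` are hypothetical; arcs are identified by the pair (source vertex, index in the arc list), and the F-graph is assumed acyclic and independent.

```python
from functools import lru_cache
from math import prod

# Hypothetical toy F-graph: vertex -> list of arcs (value, head vertices).
FGRAPH = {
    "s":  [(2.0, ("t1", "t2")), (3.0, ())],
    "t1": [(5.0, ())],
    "t2": [(1.0, ()), (4.0, ("t3",))],
    "t3": [(0.5, ())],
}

def arc_probabilities(graph, s0):
    """Map each arc (s, index in graph[s]) to its probability p_e."""

    @lru_cache(maxsize=None)
    def w(s):   # inside: total weight of the F-paths rooted at s
        return sum(v * prod(w(u) for u in head) for v, head in graph[s])

    # parents[s] lists the arcs (source, value, head) whose head contains s.
    parents = {s: [] for s in graph}
    for source, arcs in graph.items():
        for v, head in arcs:
            for t in head:
                parents[t].append((source, v, head))

    @lru_cache(maxsize=None)
    def b(s):   # outside: total weight of the contexts in which s occurs
        total = 1.0 if s == s0 else 0.0
        for source, v, head in parents[s]:
            siblings = list(head)
            siblings.remove(s)   # keep the siblings of s only
            total += v * b(source) * prod(w(u) for u in siblings)
        return total

    return {(s, i): b(s) * v * prod(w(u) for u in head) / w(s0)
            for s, arcs in graph.items()
            for i, (v, head) in enumerate(arcs)}

for arc, p in arc_probabilities(FGRAPH, "s").items():
    print(arc, round(p, 4))
```

On the toy graph, the arc `("t2", 1)` occurs in a single F-path of weight 20 out of a total weight of 33, and the sketch indeed reports 20/33 for it.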

32 Ensemble applications
Under the weighted distribution $P(p \mid \pi) = \prod_{e \in p} \pi(e) / w_{s_0}$ on the set T of F-paths:
How to generate a random F-path p from T? Complexities: Θ(|E| + |V|) time / Θ(|V|) memory for the precomputation, plus O(|p| + Σ_{e∈p} |h(e)|) time per generation.
What is the probability that a given arc is present in a random F-path? Inside/outside algorithm. Complexities: O(|V| + |E| + Σ_{e∈E} |h(e)|^2) time / Θ(|V|) memory.
How to characterize the distribution of some additive features? Extraction of generalized moments.

33 Moments of a feature distribution
Let T be the set of F-paths associated with a given hypergraph.
Definition (Feature): a feature is a function α : E → R inherited additively by F-paths through
$$\alpha(p) = \sum_{e \in p} \alpha(e).$$
Example: consider an additive price function α over F-arcs and its associated random variable X_α. Within the weighted distribution:
(A) What is the expected price of an F-path?
(B) What is the variance of the price X_α of an F-path?
(C) How does the price X_α correlate with the weight X_π?
$$A = \sum_{p \in T} \frac{\pi(p)\,\alpha(p)}{w_{s_0}} = E(X_\alpha)$$
$$B = \sum_{p \in T} \frac{\pi(p)\,\alpha(p)^2}{w_{s_0}} - E(X_\alpha)^2 = E(X_\alpha^2) - E(X_\alpha)^2$$
$$C = \frac{E(X_\alpha X_\pi) - E(X_\alpha)\,E(X_\pi)}{\sqrt{\big(E(X_\alpha^2) - E(X_\alpha)^2\big)\big(E(X_\pi^2) - E(X_\pi)^2\big)}}$$
All three can be expressed in terms of the moments of the feature(s) distribution. How to extract them?

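34 Pointing an F-graph
Goal: to extract the (combined) moment of additive features:
$$E\big(X_{\alpha_1}^{k_1} \cdots X_{\alpha_m}^{k_m}\big) = \sum_{p \in T} \frac{\pi(p)}{w_{s_0}} \prod_{i=1}^{m} \alpha_i(p)^{k_i}$$
Difficulty: mixing additive (features) and multiplicative (weighted distribution) algebraic aspects.
Solution: modify the hypergraph to introduce controlled ambiguity.
Definition (Feature-pointed F-graph): let H = (s_0, V, E, π) be a weighted F-graph and α : E → R some feature. The α-pointing of H is an F-graph $H^\alpha = (s_0^\alpha, V^\alpha, E^\alpha, \pi^\alpha)$ such that:
Pointed versions of the vertices are introduced: $V^\alpha := V \cup \{ s^\alpha \mid s \in V \}$.
New arcs are introduced to propagate a point or to erase it:
Propagation: $P^\alpha := \bigcup_{(s \to t) \in E,\; t = (t_1, \ldots, t_k)} \big\{ s^\alpha \to (t_1, \ldots, t_i^\alpha, \ldots, t_k) \mid i \in [1, k] \big\}$
Point erasure: $M^\alpha := \bigcup_{(s \to t) \in E} \{ (s^\alpha \to t) \}$
$E^\alpha := E \cup P^\alpha \cup M^\alpha$
The weight function is extended to partially incorporate the feature function. Writing e ∈ E for the arc from which e' ∈ E^α derives:
$$\pi^\alpha(e') := \begin{cases} \pi(e) & \text{if } e' \in P^\alpha \cup E \\ \alpha(e) \cdot \pi(e) & \text{otherwise } (e' \in M^\alpha) \end{cases}$$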

35 Example: simple 0 (dashed) or 1 (plain) feature function α
[Figure: an initial F-graph with its arc values π, the corresponding pointed F-graph, the initial F-paths, and the analogs in H^α of the boxed F-path.]
Rationale: through pointing, each F-path p in H is duplicated |p| times in H^α (the analogs of p). Each copy features a point erasure over a different F-arc. The total weight, under π^α, of all analogs of p is now π(p) · α(p).

36 Extracting moments
Input: F-graph H = (s_0, V, E, π).
Goal: to extract the (higher order) moment of additive features:
$$E\big(X_{\alpha_1}^{k_1} \cdots X_{\alpha_m}^{k_m}\big) = \sum_{p \in T} \frac{\pi(p)}{w_{s_0}} \prod_{i=1}^{m} \alpha_i(p)^{k_i}$$
Algorithm:
Apply the weighted count algorithm to H, giving w_{s_0}.
Point H repeatedly (commutative transform), giving H'.
Apply the weighted count algorithm to H', giving $w' = \sum_{p \in T} \pi(p) \prod_{i=1}^{m} \alpha_i(p)^{k_i}$.
Return w'/w_{s_0}.
Complexities: $O\big( 2^k |V| + \sum_{e \in E} (|h(e)| + 1) \cdot (|h(e)| + 2)^k \big)$ time and $O(2^k |V|)$ memory, with $k = \sum_{i=1}^{m} k_i$.
Remarks: this can be improved by pointing k_i times in one go instead of pointing repeatedly. Without loss of generality, |h(e)| = 2 can be assumed (CNF).
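For a single feature and k = 1, the algorithm reduces to one pointing followed by two weighted counts. The sketch below illustrates this special case only; the encoding of arcs as `(value, feature, head)` triples, the toy graph, the tagging of pointed vertices as `(s, "pt")` and all function names are assumptions made for the example.

```python
from functools import lru_cache
from math import prod

# Hypothetical toy F-graph: vertex -> list of arcs (pi, alpha, head).
FGRAPH = {
    "s":  [(2.0, 1.0, ("t1", "t2")), (3.0, 0.0, ())],
    "t1": [(5.0, 0.0, ())],
    "t2": [(1.0, 1.0, ()), (4.0, 0.0, ("t3",))],
    "t3": [(0.5, 1.0, ())],
}

def strip(graph):
    """Forget the feature: keep (pi, head) arcs only."""
    return {s: [(v, head) for v, _, head in arcs] for s, arcs in graph.items()}

def point(graph):
    """Return the alpha-pointed F-graph; pointed vertices are (s, 'pt')."""
    pointed = strip(graph)                      # original arcs E are kept
    for s, arcs in graph.items():
        new_arcs = []
        for v, a, head in arcs:
            # Propagation: push the point to exactly one head vertex.
            for i in range(len(head)):
                new_head = head[:i] + ((head[i], "pt"),) + head[i + 1:]
                new_arcs.append((v, new_head))
            # Point erasure: consume the point here, weighting by alpha.
            new_arcs.append((a * v, head))
        pointed[(s, "pt")] = new_arcs
    return pointed

def total_weight(graph, s0):
    @lru_cache(maxsize=None)
    def w(s):
        return sum(v * prod(w(u) for u in head) for v, head in graph[s])
    return w(s0)

def expected_feature(graph, s0):
    """E(X_alpha) = w' / w_{s0}, with w' computed on the pointed F-graph."""
    return total_weight(point(graph), (s0, "pt")) / total_weight(strip(graph), s0)

print(expected_feature(FGRAPH, "s"))  # 60/33, about 1.818
```

On the toy graph the three F-paths have (weight, feature) equal to (10, 2), (20, 2) and (3, 0), so the expected feature is (20 + 40 + 0)/33, which is what the pointed weighted count returns without enumerating any path.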

37 Half-time summary
Message #1: specific applications of dynamic programming could (and should) be detached from the equation, and be expressed at an abstract level.
[Diagram: conformation spaces (Sec. str. Nussinov, Sec. str. Turner, PK Dirks/Pierce, ...) feed into a hypergraph, on which the applications operate (moments extraction, random generation, count, min/max score, weighted count, arc probability, ...).]
Message #2: thanks to a pointing operator (formal derivative), one can extract arbitrary moments of feature distributions under a weighted/Boltzmann distribution.
Credits: Roytberg and Finkelstein for hypergraph DP in bioinformatics, L. Hwang for algebraic hypergraph DP, Flajolet et al. for formal derivative through pointing...
Let us now address the construction of suitable F-graphs...

40 Mfold/Unafold decomposition
This decomposition is provably unambiguous and complete, and yields an F-graph that is acyclic, independent, and has Θ(n^2) vertices and O(n^3) arcs.

Application                         Algorithm                       Time                  Memory       Reference
Energy minimization                 Min. score + π_T                O(n^3)                O(n^2)       [ZS81]
Partition function                  Weighted count + e^(-π_T/RT)    O(n^3)                O(n^2)       [McC90]
Base-pairing probabilities          Arc prob. + e^(-π_T/RT)         O(n^3)                O(n^2)       [McC90]
Statistical sampling (k samples)    Rand. gen. + e^(-π_T/RT)        O(n^3 + k n log n)    O(n^2)       [DL03, Pon08]
Moments of energy (mean, var.)      Moments + e^(-π_T/RT)           O(n^3)                O(n^2)       [MMN05]
k-th moment of additive features    Moments + e^(-π_T/RT)           O(k^3 n^3)            O(k n^2)
Correlations of additive features   Moments + e^(-π_T/RT)           O(n^3)                O(n^2)
Generalized moments                 Moments + e^(-π_T/RT)           O(4^k n^3)            O(2^k n^2)

42 Basic pseudoknots (Akutsu)
An unambiguous decomposition capturing the simple pseudoknots of Akutsu et al. yields O(n^4)/O(n^4) time/memory algorithms for the nearest-neighbor energy model, and O(n^5) time/O(n^4) memory for the Turner model.
We can now answer new questions:
What is the probability of the MFE within Akutsu's class of conformations?
What is the expected number of pseudoknots within a given sequence?
...

44 Conclusion
Using hypergraphs, we were able to successfully detach the conformation space from the application.
Implementation? We still have to mess with indices :(
Start from CFG? Limited... (and already done by ADP) but maybe worthwhile.
Use Mathias Möhl's split types?
Account for additional parameters (RNAMutants, RNABor...).
Pseudoknots gene scanning (in progress).
Non-canonical motifs?
Rebuild distributions from truncated moments (Economics??)

45 References
[CDR+04] A. Condon, B. Davy, B. Rastegari, S. Zhao, and F. Tarrant. Classifying RNA pseudoknotted structures. Theoretical Computer Science, 320(1):35–50, 2004.
[DL03] Y. Ding and E. Lawrence. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res, 31(24), 2003.
[LP00] R. B. Lyngsø and C. N. S. Pedersen. RNA pseudoknot prediction in energy-based models. Journal of Computational Biology, 7(3-4):409–427, 2000.
[LW01] N. Leontis and E. Westhof. Geometric nomenclature and classification of RNA base pairs. RNA, 7:499–512, 2001.
[McC90] J. S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers, 29:1105–1119, 1990.
[MMN05] István Miklós, Irmtraud M. Meyer, and Borbála Nagy. Moments of the Boltzmann distribution for RNA secondary structures. Bull Math Biol, 67(5):1031–1047, Sep 2005.
[Pon08] Y. Ponty. Efficient sampling of RNA secondary structures from the Boltzmann ensemble of low-energy: The boustrophedon method. J Math Biol, 56(1-2):107–127, Jan 2008.
[ZS81] M. Zuker and P. Stiegler. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res, 9:133–148, 1981.

46 Complexity of secondary structure folding
[Figure: alternative strategy for interior loops, creating the symmetric part first and the asymmetric part afterward.]
Due to interior loops, the set of F-arcs generated for the Q case has apparent cardinality in O(n^4), while we claim O(n^3) complexities.
A pragmatic answer is that it is common practice to bound the interior loop size (j − j') + (i' − i) by a predefined constant K = 30, which brings the overall complexity back to O(n^3).
A more theoretically sound approach achieves a similar complexity by further decomposing internal loops into a symmetric loop followed by a fully asymmetric one, as illustrated in the figure. Such a modification introduces a new case in the decomposition and captures the asymmetry of the loop with a parameter l, but the logarithmic nature of the entropy term in the Turner model then requires a small approximation.


More information

A Method for Aligning RNA Secondary Structures

A Method for Aligning RNA Secondary Structures Method for ligning RN Secondary Structures Jason T. L. Wang New Jersey Institute of Technology J Liu, JTL Wang, J Hu and B Tian, BM Bioinformatics, 2005 1 Outline Introduction Structural alignment of RN

More information

Local Alignment of RNA Sequences with Arbitrary Scoring Schemes

Local Alignment of RNA Sequences with Arbitrary Scoring Schemes Local Alignment of RNA Sequences with Arbitrary Scoring Schemes Rolf Backofen 1, Danny Hermelin 2, ad M. Landau 2,3, and Oren Weimann 4 1 Institute of omputer Science, Albert-Ludwigs niversität Freiburg,

More information

Quantitative modeling of RNA single-molecule experiments. Ralf Bundschuh Department of Physics, Ohio State University

Quantitative modeling of RNA single-molecule experiments. Ralf Bundschuh Department of Physics, Ohio State University Quantitative modeling of RN single-molecule experiments Ralf Bundschuh Department of Physics, Ohio State niversity ollaborators: lrich erland, LM München Terence Hwa, San Diego Outline: Single-molecule

More information

BIOINFORMATICS. Fast evaluation of internal loops in RNA secondary structure prediction. Abstract. Introduction

BIOINFORMATICS. Fast evaluation of internal loops in RNA secondary structure prediction. Abstract. Introduction BIOINFORMATICS Fast evaluation of internal loops in RNA secondary structure prediction Abstract Motivation: Though not as abundant in known biological processes as proteins, RNA molecules serve as more

More information

Structure-Based Comparison of Biomolecules

Structure-Based Comparison of Biomolecules Structure-Based Comparison of Biomolecules Benedikt Christoph Wolters Seminar Bioinformatics Algorithms RWTH AACHEN 07/17/2015 Outline 1 Introduction and Motivation Protein Structure Hierarchy Protein

More information

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES

COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES COMBINATORICS OF LOCALLY OPTIMAL RNA SECONDARY STRUCTURES ÉRIC FUSY AND PETER CLOTE Abstract. It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is 1.104366

More information

Computational Approaches for determination of Most Probable RNA Secondary Structure Using Different Thermodynamics Parameters

Computational Approaches for determination of Most Probable RNA Secondary Structure Using Different Thermodynamics Parameters Computational Approaches for determination of Most Probable RNA Secondary Structure Using Different Thermodynamics Parameters 1 Binod Kumar, Assistant Professor, Computer Sc. Dept, ISTAR, Vallabh Vidyanagar,

More information

Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction

Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction Sparse RNA Folding Revisited: Space-Efficient Minimum Free Energy Prediction Sebastian Will 1 and Hosna Jabbari 2 1 Bioinformatics/IZBI, University Leipzig, swill@csail.mit.edu 2 Ingenuity Lab, National

More information

The Double Helix. CSE 417: Algorithms and Computational Complexity! The Central Dogma of Molecular Biology! DNA! RNA! Protein! Protein!

The Double Helix. CSE 417: Algorithms and Computational Complexity! The Central Dogma of Molecular Biology! DNA! RNA! Protein! Protein! The Double Helix SE 417: lgorithms and omputational omplexity! Winter 29! W. L. Ruzzo! Dynamic Programming, II" RN Folding! http://www.rcsb.org/pdb/explore.do?structureid=1t! Los lamos Science The entral

More information

CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models

CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models Supplementary Material for CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models Chuong B Do, Daniel A Woods, and Serafim Batzoglou Stanford University, Stanford, CA 94305, USA, {chuongdo,danwoods,serafim}@csstanfordedu,

More information

Moments of the Boltzmann distribution for RNA secondary structures

Moments of the Boltzmann distribution for RNA secondary structures Bulletin of Mathematical Biology 67 (2005) 1031 1047 www.elsevier.com/locate/ybulm Moments of the Boltzmann distribution for RNA secondary structures István Miklós a, Irmtraud M. Meyer b,,borbála Nagy

More information

Computational RNA Secondary Structure Design:

Computational RNA Secondary Structure Design: omputational RN Secondary Structure Design: Empirical omplexity and Improved Methods by Rosalía guirre-hernández B.Sc., niversidad Nacional utónoma de México, 996 M. Eng., niversidad Nacional utónoma de

More information

A tutorial on RNA folding methods and resources

A tutorial on RNA folding methods and resources A tutorial on RNA folding methods and resources Alain Denise, LRI/IGM, Université Paris-Sud with invaluable help from Yann Ponty, CNRS/Ecole Polytechnique 1 Master BIBS 2014-2015 Goals To help your work

More information

Lecture 12. DNA/RNA Structure Prediction. Epigenectics Epigenomics: Gene Expression

Lecture 12. DNA/RNA Structure Prediction. Epigenectics Epigenomics: Gene Expression C N F O N G A V B O N F O M A C S V U Master Course DNA/Protein Structurefunction Analysis and Prediction Lecture 12 DNA/NA Structure Prediction pigenectics pigenomics: Gene xpression ranscription factors

More information

Sparse RNA folding revisited: space efficient minimum free energy structure prediction

Sparse RNA folding revisited: space efficient minimum free energy structure prediction DOI 10.1186/s13015-016-0071-y Algorithms for Molecular Biology RESEARCH ARTICLE Sparse RNA folding revisited: space efficient minimum free energy structure prediction Sebastian Will 1* and Hosna Jabbari

More information

Junction-Explorer Help File

Junction-Explorer Help File Junction-Explorer Help File Dongrong Wen, Christian Laing, Jason T. L. Wang and Tamar Schlick Overview RNA junctions are important structural elements of three or more helices in the organization of the

More information

Lab III: Computational Biology and RNA Structure Prediction. Biochemistry 208 David Mathews Department of Biochemistry & Biophysics

Lab III: Computational Biology and RNA Structure Prediction. Biochemistry 208 David Mathews Department of Biochemistry & Biophysics Lab III: Computational Biology and RNA Structure Prediction Biochemistry 208 David Mathews Department of Biochemistry & Biophysics Contact Info: David_Mathews@urmc.rochester.edu Phone: x51734 Office: 3-8816

More information

arxiv: v1 [q-bio.bm] 16 Aug 2015

arxiv: v1 [q-bio.bm] 16 Aug 2015 Asymptotic connectivity for the network of RNA secondary structures. Clote arxiv:1508.03815v1 [q-bio.bm] 16 Aug 2015 Biology Department, Boston College, Chestnut Hill, MA 02467, clote@bc.edu Abstract Given

More information

RNA folding with hard and soft constraints

RNA folding with hard and soft constraints DOI 10.1186/s13015-016-0070-z Algorithms for Molecular Biology RESEARCH RNA folding with hard and soft constraints Ronny Lorenz 1*, Ivo L. Hofacker 1,2,3 and Peter F. Stadler 1,3,4,5,6,7 Open Access Abstract

More information

arxiv: v3 [math.co] 17 Jan 2018

arxiv: v3 [math.co] 17 Jan 2018 An infinite class of unsaturated rooted trees corresponding to designable RNA secondary structures arxiv:1709.08088v3 [math.co] 17 Jan 2018 Jonathan Jedwab Tara Petrie Samuel Simon 23 September 2017 (revised

More information

Computational and physical models of RNA structure

Computational and physical models of RNA structure omputational and physical models of RN structure Ralf Bundschuh Ohio State niversity July 25, 2007 Ralf Bundschuh (Ohio State niversity) Modelling RN structure July 25, 2007 1 / 98 Outline of all lectures

More information
