ROC TPR FPR
|
|
- Kory Hicks
- 5 years ago
- Views:
Transcription
1 HW5
2 2
3 3
4 TPR RO TPR = True Positive Rate = Sensitivity = Recall = TP/(TP+FN) FPR = False Positive Rate = 1 Specificity = FP/(FP+TN) Precision, aka PPV = TP/(TP+FP) FPR 4
5 TPR FPR 5
6 * scores * orfs$len 6
7 scores log2(orfs$len) 7
8 FPR FPR scores scores TPR TPR orfs$len log2(orfs$len)
9 RNA Search and Motif Discovery SEP 590 A omputational Biology
10 15
11 Approaches to Structure Prediction Maximum Pairing + works on single sequences + simple - too inaccurate Minimum Energy + works on single sequences - ignores pseudoknots - only finds optimal fold Partition Function + finds all folds - ignores pseudoknots
12 Optimal pairing of r i... r j Two possibilities j Unpaired: Find best pairing of r i... r j-1 j! i! j-1! j Paired (with some k): Find best r i... r k-1 + best r k+1... r j-1 plus 1 i! k-1! k! Why is it slow? Why do pseudoknots matter? j! j-1! k+1!
13 Nussinov: A omputation Order B(i,j) = # pairs in optimal pairing of r i... r j B(i,j) = 0 for all i, j with i j-4; otherwise B(i,j) = max of: B(i,j-1) Or energy max { B(i,k-1)+1+B(k+1,j-1) i k < j-4 and r k -r j may pair} Time: O(n 3 ) Loop-based energy version is better; recurrences similar, slightly messier K=
14 Today Structure prediction via comparative analysis ovariance Models (Ms) represent RNA sequence/structure motifs Fast M search Motif Discovery Applications in prokaryotes & vertebrates 19
15 Approaches, II omparative sequence analysis + handles all pairings (potentially incl. pseudoknots) - requires several (many?) aligned, appropriately diverged sequences Stochastic ontext-free rammars Roughly combines min energy & comparative, but no pseudoknots Physical experiments (x-ray crystalography, NMR) Today
16 ovariation is strong evidence for base pairing 21
17 Example: Ribosomal Autoregulation: Excess L19 represses L19 (RF00556; similar) A B P2 L19 (rpls) mrna leader Y U A U A R R Y U R R U 5' Y 3' P1 N N N TSS nucleotide identity nucleotide present 97% 97% 90% 90% 75% 75% 50% A stem loop always present compensatory mutations compatible mutations Watson-rick base pair other base interaction 5' P1 P2 RBS Start A A U U U U U A A U A U U 3' U U U A A A U A A U A U A A U U U U B. subtilis L19 mrna leader U U U U U L19 U U U A A A A U U U U U A U A 3' U A A U U U A 5' A A A A? 22
18 Mutual Information M ij = f f xi,xj xi,xj log xi,xj 2 f xi f xj ; 0 M ij 2 Max when no seq conservation but perfect pairing MI = expected score gain from using a pair state (below) Finding optimal MI, (i.e. opt pairing of cols) is hard(?) Finding optimal MI without pseudoknots can be done by dynamic programming 23
19 27
20 omputational Problems How to predict secondary structure How to model an RNA motif (I.e., sequence/structure pattern) iven a motif, how to search for instances iven (unaligned) sequences, find motifs How to score discovered motifs How to leverage prior knowledge 29
21 Motif Description
22 RNA Motif Models ovariance Models (Eddy & Durbin 1994) aka profile stochastic context-free grammars aka hidden Markov models on steroids Model position-specific nucleotide preferences and base-pair preferences Pro: accurate on: model building hard, search slow 32
23 Eddy & Durbin 1994: What A probabilistic model for RNA families The ovariance Model A Stochastic ontext-free rammar A generalization of a profile HMM Algorithms for Training From aligned or unaligned sequences Automates comparative analysis omplements Nusinov/Zucker RNA folding Algorithms for searching 33
24 Main Results Very accurate search for trna (Precursor to trnascanse - current favorite) iven sufficient data, model construction comparable to, but not quite as good as, human experts Some quantitative info on importance of pseudoknots and other tertiary features 34
25 Probabilistic Model Search As with HMMs, given a sequence, you calculate likelihood ratio that the model could generate the sequence, vs a background model You set a score threshold Anything above threshold a hit Scoring: Forward / Inside algorithm - sum over all paths Viterbi approximation - find single best path (Bonus: alignment & structure prediction) 35
26 Example: searching for trnas 36
27 Profile Hmm Structure Mj: Match states (20 emission probabilities) Ij: Insert states (Background emission probabilities) Dj: Delete states (silent - no emission) 38
28 How to model an RNA Motif? onceptually, start with a profile HMM: from a multiple alignment, estimate nucleotide/ insert/delete preferences for each position given a new seq, estimate likelihood that it could be generated by the model, & align it to the model mostly all del ins 39
29 How to model an RNA Motif? Add column pairs and pair emission probabilities for base-paired regions <<<<<<< paired columns >>>>>>> 40
30 Profile Hmm Structure Mj: Match states (20 emission probabilities) Ij: Insert states (Background emission probabilities) Dj: Delete states (silent - no emission) 41
31 M Structure A: Sequence + structure B: the M guide tree : probabilities of letters/ pairs & of indels Think of each branch being an HMM emitting both sides of a helix (but 3 side emitted in reverse order) 47
32 Overall M Architecture One box ( node ) per node of guide tree BE/MATL/INS/DEL just like an HMM MATP & BIF are the key additions: MATP emits pairs of symbols, modeling basepairs; BIF allows multiple helices 48
33 M Viterbi Alignment (the inside algorithm) x i x ij = i th letter of input = substring i,..., j of input T yz = P(transition y z) y E xi,x j = P(emission of x i, x j from state y) S ij y = max π log P(x ij gen'd starting in state y via path π) 49
34 M Viterbi Alignment (the inside algorithm) S ij y = max π log P(x ij generated starting in state y via path π) S ij y = % z max z [S i+1, j 1 ' z max z [S i+1, j ' z & max z [S i, j 1 ' z ' max z [S i, j (' max i<k j [S i,k y left y + logt yz + log E xi,x j y + logt yz + log E xi ] match pair ] match/insert left + logt yz + log E x j ] match/insert right + logt yz ] delete y + S right k +1, j ] bifurcation Time O(qn 3 ), q states, seq len n Time O(qn 3 ), q states, seq len n compare: O(qn) for profile HMM 50
35 An Important Application: Rfam
36 Rfam an RNA family DB riffiths-jones, et al., NAR 03, 05, 08, 11, 12 Was biggest scientific comp user in Europe cpu cluster for a month per release Rapidly growing: Rel 1.0, 1/03: 25 families, 55k instances Rel 7.0, 3/05: 503 families, 363k instances Rel 9.0, 7/08: 603 families, 636k instances Rel 9.1, 1/09: 1372 families, 1148k instances Rel 10.0, 1/10: 1446 families, 3193k instances Rel 11.0, 8/12: 2208 families, 6125k instances DB size: ~8B ~160B ~320B 58
37 RF00037: Example Rfam Family Input (hand-curated): MSA seed alignment SS_cons Score Thresh T Window Len W Output: M scan results & full alignment phylogeny, etc. IRE (partial seed alignment): Hom.sap. UUUUUAAAUUUUAUAA Hom.sap. UUUUU.UUAAAUUUUAUAA Hom.sap. UUUUUUUAAAUUUA.AA Hom.sap. UUUAU..AUAAAUUAU.AUAAA Hom.sap. UUUUUUAAAUUUUAUAA Hom.sap. AUUAU..AAAUUUU.AUAAU Hom.sap. UUU..UUAAAUUUUAAA Hom.sap. UUAU..AAAUAUU.AUAU Hom.sap. AUUAU..AAAUUU.AUAAU av.por. UUUUUAAAUUUAA Mus.mus. UAUAU..AAAUAUU.AUAU Mus.mus. UUUUUUAAAUUUAAAA Mus.mus. UAUUUUAAAUUUUAAAA Rat.nor. UAUAU..AAAUAU.AUAU Rat.nor. UAUUUUUAAAUUUUAAA SS_cons <<<<<...<<<<<...>>>>>.>>>>> 62
38 An Important Need: Faster Search
39 Homology search Homolog similar by descent from common ancestor Sequence-based Smith-Waterman FASTA BLAST For RNA, sharp decline in sensitivity at ~60-70% identity So, use structure, too 73
40 Impact of RNA homology search B. subtilis! glycine riboswitch! (Barrick, et al., 2004)! operon! L. innocua! A. tumefaciens! V. cholera! M. tuberculosis! (and 19 more species)! 74
41 Impact of RNA homology search B. subtilis! L. innocua! glycine riboswitch! (Barrick, et al., 2004)! operon! (Mandal, et al., 2004)! A. tumefaciens! V. cholera! M. tuberculosis! (and 19 more species)! (and 42 more species)! BLAST-based M-based 75
42 Faster enome Annotation of Non-coding RNAs Without Loss of Accuracy Zasha Weinberg & W.L. Ruzzo Recomb 04, ISMB 04, Bioinfo 06
43 RaveNnA: enome Scale RNA Search Typically 100x speedup over raw M, w/ no loss in accuracy: Drop structure from M to create a (faster) HMM Use that to pre-filter sequence; Discard parts where, provably, M score < threshold; Actually run M on the rest (the promising parts) Assignment of HMM transition/emission scores is key (a large convex optimization problem) Weinberg & Ruzzo, Bioinformatics, 2004,
44 M s are good, but slow Rfam Reality EMBL Our Work EMBL Rfam oal EMBL BLAST Ravenna junk M hits 1 month, 1000 computers M hits ~2 months, 1000 computers M junk hits 10 years, 1000 computers 79
45 Oversimplified M (for pedagogical purposes only) A U A U A U A U 81
46 M to HMM A U A U A U A U A U A U A U A U M HMM emisions per state 5 emissions per state, 2x states
47 Key Issue: 25 scores 10 M HMM A U P A U A U L R A U Need: log Viterbi scores M HMM 83
48 Key Issue: 25 scores 10 M A U P A U Need: log Viterbi scores M HMM P AA L A + R A P A L A + R P A L A + R P AU L A + R U P A L A + R A U L P A L + R A P L + R P L + R P U L + R U P L + R R HMM A U NB: HMM not a prob. model 85
49 Rigorous Filtering Any scores satisfying the linear inequalities give rigorous filtering Proof: M Viterbi path score corresponding HMM path score Viterbi HMM path score (even if it does not correspond to any M path) P AA L A + R A P A L A + R P A L A + R P AU L A + R U P A L A + R 86
50 Minimizing E(L i, R i ) (subject to linear constraints) alculate E(L i, R i ) symbolically, in terms of emission scores, so we can do partial derivatives for numerical convex optimization algorithm Forward: Viterbi: E(L 1, L 2,...) L i 90
51 Assignment of scores/ probabilities onvex optimization problem onstraints: enforce rigorous property Objective function: filter as aggressively as possible Problem sizes: variables inequality constraints 91
52 onvex Optimization onvex: local max = global max; simple hill climbing works Nonconvex: can be many local maxima, global max; hill-climbing fails 92
53 Estimated Filtering Efficiency (139 Rfam 4.0 families) Filtering fraction # families (compact) # families (expanded) break even < Averages 283 times faster than M! ~100x speedup 93
54 Motif Discovery
55 RNA Motif Discovery Would be great if: given 100 complete genomes from diverse species, we could automatically find all the RNAs. State of the art: that s hopeless Hope: can we exploit biological knowledge to narrow the search space? 130
56 RNA Motif Discovery More promising problem: given a unaligned sequences of a few kb, most of which contain instances of one RNA motif of bp -- find it. Example: 5 UTRs of orthologous glycine cleavage genes from γ-proteobacteria Example: corresponding introns of orthogolous vertebrate genes Orthologs = counterparts in different species 131
57 Approaches Align-First: Align sequences, then look for common structure Fold-First: Predict structures, then try to align them Joint: Do both together 132
58 Pitfall for sequence-alignment- first approach Structural conservation Sequence conservation Alignment without structure information is unreliable LUSTALW alignment of SEIS elements with flanking regions same-colored boxes should be aligned 134
59 Approaches Align-first: align sequences, then look for common structure Fold-first: Predict structures, then try to align them single-seq struct prediction only ~ 60% accurate; exacerbated by flanking seq; no biologicallyvalidated model for structural alignment Joint: Do both together Sankoff good but slow Heuristic 139
60 Our Approach: Mfinder Simultaneous local alignment, folding and Mbased motif description using an EM-style learning procedure Yao, Weinberg & Ruzzo, Bioinformatics,
61 MFinder Simultaneous alignment, folding & motif description Yao, Weinberg & Ruzzo, Bioinformatics, 2006 Folding predictions ombines folding & mutual information in a principled way. Smart heuristics andidate alignment M Mutual Information Realign EM 146
62 Mfinder Accuracy (on Rfam families with flanking sequence) /W /W 153
63 Discovery in Bacteria 156
64 Right Data: Why/How We can recognize, say, 5-10 good examples amidst 20 extraneous ones (but not 5 in 200 or 2000) of length 1k or 10k (but not 100k) Regulators often near regulatees (protein coding genes), which are usually recognizable cross-species So, look near similar genes ( homologs ) Many riboswitches, e.g., are present in ~5 copies per genome (Not strategy used in vertebrates x larger genomes) 164
65 Processing Times Identify DD group members 2946 DD groups Retrieve upstream sequences < 10 PU days Footprinter ranking < 10 PU days Input from ~70 complete Firmicute genomes available in late 2005-early 2006, totaling ~200 megabases Mfinder motifs Motif postprocessing 1740 motifs RaveNnA Mfinder refinement 1 ~ 2 PU months 10 PU months < 1 PU month Motif postprocessing 1466 motifs 179
66 Table 1: Motifs that correspond to Rfam families Rank Score # DD Rfam RAV MF FP RAV MF ID ene Description IlvB Thiamine pyrophosphate-requiring enzymes RF00230 T-box O3859 Predicted membrane protein RF00059 THI MetH Methionine synthase I specific DNA methylase RF00162 S_box O0116 Predicted N6-adenine-specific DNA methylase RF00011 RNaseP_bact_b DHBP 3,4-dihydroxy-2-butanone 4-phosphate synthase RF00050 RFN uaa MP synthase RF00167 Purine cvp lycine cleavage system protein P RF00504 lycine DUF149 Uncharacterised BR, YbaB family O0718 RF00169 SRP_bact bib obalamin biosynthesis protein obd/bib RF00174 obalamin LysA Diaminopimelate decarboxylase RF00168 Lysine Ter Membrane protein Ter RF00080 yybp-ykoy MgtE Mg/o/Ni transporter MgtE RF00380 ykok lms lucosamine 6-phosphate synthetase RF00234 glms OpuBB AB-type proline/glycine betaine transport systems RF00005 trna EmrE Membrane transporters of cations and cationic drug RF00442 ykk-yxkd O0398 Uncharacterized conserved protein RF00023 tmrna Table 1: Motifs that correspond to Rfam families. Rank : the three columns show ranks for refined motif clusters after genome scans ( RAV ), Mfinder motifs before genome scans ( MF ), and FootPrinter results ( FP ). We used the same ranking scheme for RAV and MF. Score 180
67 Rfam Membership Overlap Structure # Sn Sp nt Sn Sp bp Sn Sp RF00174 obalamin RF00504 lycine RF00234 glms RF00168 Lysine RF00167 Purine RF00050 RFN RF00011 RNaseP_bact_b RF00162 S_box RF00169 SRP_bact RF00230 T-box RF00059 THI RF00442 ykk-yxkd RF00380 ykok RF00080 yybp-ykoy mean median Tbl 2: Prediction accuracy compared to prokaryotic subset of Rfam full alignments. Membership: # of seqs in overlap between our predictions and Rfam s, the sensitivity (Sn) and specificity (Sp) of our membership predictions. Overlap: the avg len of overlap between our predictions and Rfam s (nt), the fractional lengths of the overlapped region in Rfam s predictions (Sn) and in ours (Sp). Structure: the avg # of correctly predicted canonical base pairs (in overlapped regions) in the secondary structure (bp), and sensitivity and specificity of our predictions. 1 After 2nd RaveNnA scan, membership Sn of lycine, obalamin increased to 76% and 98% resp., lycine Sp unchanged, but obalamin Sp dropped to 84%. 184
68 Example: Ribosomal Autoregulation: Excess L19 represses L19 (RF00556; similar) A B P2 L19 (rpls) mrna leader Y U A U A R R Y U R R U 5' Y 3' P1 N N N TSS nucleotide identity nucleotide present 97% 97% 90% 90% 75% 75% 50% A stem loop always present compensatory mutations compatible mutations Watson-rick base pair other base interaction 5' P1 P2 RBS Start A A U U U U U A A U A U U 3' U U U A A A U A A U A U A A U U U U B. subtilis L19 mrna leader U U U U U L19 U U U A A A A U U U U U A U A 3' U A A U U U A 5' A A A A? 186
69 Examples: 6 (of 22) Representative motifs Sudarsan, et al Science, 2008! EMM! Wang, et al Mol ell, 2008! SAH! suca! boxed = confirmed riboswitch SAM-IV! Weinberg et al RNA 08! Moo! Regulski et al Mol Microbiol 08! Legend! O4708! Weinberg, et al. Nucl. Acids Res., July : ! Meyer, et al RNA, 2008! 191
70 Vertebrate ncrnas Some Results
71 Human Predictions Evofold S Pedersen, Bejerano, A Siepel, K Rosenbloom, K Lindblad-Toh, ES Lander, J Kent, W Miller, D Haussler, "Identification and classification of conserved RNA secondary structures in the human genome." PLoS omput. Biol., 2, #4 (2006) e33. 48,479 candidates (~70% FDR?) RNAz S Washietl, IL Hofacker, M Lukasser, A Hutenhofer, PF Stadler, "Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome." Nat. Biotechnol., 23, #11 (2005) ,000 structured RNA elements 1,000 conserved across all vertebrates. ~1/3 in introns of known genes, ~1/6 in UTRs ~1/2 located far from any known gene FOLDALIN E Torarinsson, M Sawera, JH Havgaard, M Fredholm, J orodkin, "Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure." enome Res., 16, #7 (2006) candidates from (of 100,000) pairs Mfinder Torarinsson, Yao, Wiklund, Bramsen, Hansen, Kjems, Tommerup, Ruzzo and orodkin. omparative genomics beyond sequence based alignments: RNA structures in the ENODE regions. enome Research, Feb 2008, 18(2): PMID: candidates in ENODE alone (better FDR, but still high) Some details below
72 Mfinder Search in Vertebrates Extract ENODE * Multiz alignments Remove exons, most conserved elements blocks, 8.7M bps. Apply Mfinder to both strands. 10,106 predictions, 6,587 clusters. High false positive rate, but still suggests 1000 s of RNAs. (We ve applied Mfinder to whole human genome: many 100 s of PU years. Analysis in progress.) Trust 17-way alignment for orthology, not for detailed alignment * ENODE: deeply annotated 1% of human genome 216
73 Log 10 (ount) enomic Intergap Distances (Human-Mouse) Length enome-wide Identification of Human Functional DNA Using a Neutral Indel Model erton Lunter, hris P. Ponting, Jotun Hein, PLoS omput Biol 2006, 2(1): e5. 220
74 Overlap w/ Indel Purified Segments IPS presumed to signal purifying selection Majority (64%) of candidates have >45% + Strong P-value for their overlap w/ IPS + data P N Expected Observed P-value % 0-35 igs % igs % igs % igs E % igs E % all igs E % 221
75 Realignment % realigned Average pairwise sequence similarity 225
76 Alignment Matters The original MULTIZ alignment without flanking regions. RNAz Score: (no RNA) Human TATTAAAATT-TTTAAAAAAT----TTAAATATAAAAAATAATT himp AATTTAATT-ATTTAAAAAT----ATTAAATATAAAATAAATT ow TATTTAAAATT-ATAAA--AAAAT----TTAATTTAAAAATTAATT Dog TATTTAAAATTTTAATA--AAAAAT----TTAATTTAAAATATTAATT Rabbit ATATTTAAAATTT-TTTTAATAAAAT----TTAATTATAAAATTAAATT Rhesus TATTAAAATT-TTTAAAAAATATTTAAATATAAAAAATAATT Str ((((((...(((((((...(((...)))..))))...)))...))))))...( The local Mfinder re-alignment of the MULTIZ block. RNAz Score: (RNA) Human TATTAAAATT-TTTAAA-A-----AATTTAAATATAAAAAATAA himp AATTTAATT-ATTT-AAA-----AATATTAAATATAAAATAAA ow TATTTAAAATT-ATAAA--AAA------ATTTAATTTAAAAATTAA Dog TATTTAAAATTTTAATA--AAA-A-----ATTTAATTTAAAATATTAA Rabbit ATATTTAAAATTT-TTTT-AATA-----AAATTTAATTATAAAATTAAA Rhesus TATTAAAATT-TTTAAA-AAA-TATTTAAATATAAAAAATAA Str ((((((...((((((((..(((...)))...))))))))...))))))
77 10 of 11 top (differentially) expressed 228
78 ncrna Summary ncrna is a hot topic For family homology modeling: Ms Training & search like HMM (but slower) Dramatic acceleration possible Automated model construction possible New computational methods yield new discoveries Many open problems 247
79 ourse Wrap Up
80 What is DNA? RNA? How many Amino Acids are there? Did human beings, as we know them, develop from earlier species of animals? What are stem cells? What did Viterbi invent? What is dynamic programming? What is a likelihood ratio test? What is the EM algorithm? How would you find the maximum of f(x) = ax 3 + bx 2 + cx +d in the interval -10<x<25? 249
81 Sensors High-Throughput DNA sequencing Microarrays/ene expression Mass Spectrometry/Proteomics BioTech Protein/protein & DNA/protein interaction ontrols loning ene knock out/knock in RNAi Floods of data rand hallenge problems 250
82 Exciting Times Lots to do Highly multidisciplinary You ll be hearing a lot more about it I hope I ve given you a taste of it
83 Thanks!
RNA Search and! Motif Discovery" Genome 541! Intro to Computational! Molecular Biology"
RNA Search and! Motif Discovery" Genome 541! Intro to Computational! Molecular Biology" Day 1" Many biologically interesting roles for RNA" RNA secondary structure prediction" 3 4 Approaches to Structure
More informationSearching for Noncoding RNA
Searching for Noncoding RN Larry Ruzzo omputer Science & Engineering enome Sciences niversity of Washington http://www.cs.washington.edu/homes/ruzzo Bio 2006, Seattle, 8/4/2006 1 Outline Noncoding RN Why
More informationRNA Search and Motif Discovery. Lectures CSE 527 Autumn 2007
RNA Search and Motif Discovery Lectures 18-19 CSE 527 Autumn 2007 The Human Parts List, circa 2001 1 gagcccggcc cgggggacgg gcggcgggat agcgggaccc cggcgcggcg gtgcgcttca 61 gggcgcagcg gcggccgcag accgagcccc
More informationGenome 559 Wi RNA Function, Search, Discovery
Genome 559 Wi 2009 RN Function, Search, Discovery The Message Cells make lots of RN noncoding RN Functionally important, functionally diverse Structurally complex New tools required alignment, discovery,
More informationModeling and Searching for Non-Coding RNA
Modeling and Searching for Non-oding RN W.L. Ruzzo http://www.cs.washington.edu/homes/ruzzo http://www.cs.washington.edu/homes/ruzzo/ courses/gs541/09sp ENOME 541 protein and DN sequence analysis to determine
More informationRNA Search and! Motif Discovery"
RN Search and! Motif Discovery" Lecture 9! SEP 590! utumn 2008" Outline" Whirlwind tour of ncrn search & discovery" RN motif description (ovariance Model Review)" lgorithms for searching" Rigorous & heuristic
More informationModeling and Searching for Non-Coding RNA
Modeling and Searching for Non-oding RN W.L. Ruzzo http://www.cs.washington.edu/homes/ruzzo The entral Dogma DN RN Protein gene Protein DN (chromosome) cell RN (messenger) lassical RNs mrn trn rrn snrn
More informationModeling and Searching for Non-Coding RNA
Modeling and Searching for Non-oding RN W.L. Ruzzo http://www.cs.washington.edu/homes/ruzzo Outline Why RN? Examples of RN biology omputational hallenges Modeling Search Inference The entral Dogma DN RN
More informationThe Double Helix. CSE 417: Algorithms and Computational Complexity! The Central Dogma of Molecular Biology! DNA! RNA! Protein! Protein!
The Double Helix SE 417: lgorithms and omputational omplexity! Winter 29! W. L. Ruzzo! Dynamic Programming, II" RN Folding! http://www.rcsb.org/pdb/explore.do?structureid=1t! Los lamos Science The entral
More informationCSEP 527 Computational Biology. RNA: Function, Secondary Structure Prediction, Search, Discovery
SEP 527 omputational Biology RN: Function, Secondary Structure Prediction, Search, Discovery The Message ells make lots of RN noncoding RN Functionally important, functionally diverse Structurally complex
More informationComparative Gene Finding. BMI/CS 776 Spring 2015 Colin Dewey
Comparative Gene Finding BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2015 Colin Dewey cdewey@biostat.wisc.edu Goals for Lecture the key concepts to understand are the following: using related genomes
More informationRNA Secondary Structure. CSE 417 W.L. Ruzzo
RN Secondary Structure SE 417 W.L. Ruzzo The Double Helix Los lamos Science The entral Dogma of Molecular Biology DN RN Protein gene Protein DN (chromosome) cell RN (messenger) Non-coding RN Messenger
More informationTowards a Comprehensive Annotation of Structured RNAs in Drosophila
Towards a Comprehensive Annotation of Structured RNAs in Drosophila Rebecca Kirsch 31st TBI Winterseminar, Bled 20/02/2016 Studying Non-Coding RNAs in Drosophila Why Drosophila? especially for novel molecules
More informationHidden Markov Models. Main source: Durbin et al., Biological Sequence Alignment (Cambridge, 98)
Hidden Markov Models Main source: Durbin et al., Biological Sequence Alignment (Cambridge, 98) 1 The occasionally dishonest casino A P A (1) = P A (2) = = 1/6 P A->B = P B->A = 1/10 B P B (1)=0.1... P
More informationCAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan
CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs15.html Describing & Modeling Patterns
More informationA Computational Pipeline for High- Throughput Discovery of cis-regulatory Noncoding RNA in Prokaryotes
A Computational Pipeline for High- Throughput Discovery of cis-regulatory Noncoding RNA in Prokaryotes Zizhen Yao 1*, Jeffrey Barrick 2, Zasha Weinberg 3, Shane Neph 1,4, Ronald Breaker 2,3,5, Martin Tompa
More informationHMM applications. Applications of HMMs. Gene finding with HMMs. Using the gene finder
HMM applications Applications of HMMs Gene finding Pairwise alignment (pair HMMs) Characterizing protein families (profile HMMs) Predicting membrane proteins, and membrane protein topology Gene finding
More informationPredicting RNA Secondary Structure
7.91 / 7.36 / BE.490 Lecture #6 Mar. 11, 2004 Predicting RNA Secondary Structure Chris Burge Review of Markov Models & DNA Evolution CpG Island HMM The Viterbi Algorithm Real World HMMs Markov Models for
More informationConserved RNA Structures. Ivo L. Hofacker. Institut for Theoretical Chemistry, University Vienna.
onserved RN Structures Ivo L. Hofacker Institut for Theoretical hemistry, University Vienna http://www.tbi.univie.ac.at/~ivo/ Bled, January 2002 Energy Directed Folding Predict structures from sequence
More informationStephen Scott.
1 / 21 sscott@cse.unl.edu 2 / 21 Introduction Designed to model (profile) a multiple alignment of a protein family (e.g., Fig. 5.1) Gives a probabilistic model of the proteins in the family Useful for
More informationComputational Biology: Basics & Interesting Problems
Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 07: profile Hidden Markov Model http://bibiserv.techfak.uni-bielefeld.de/sadr2/databasesearch/hmmer/profilehmm.gif Slides adapted from Dr. Shaojie Zhang
More informationDetecting non-coding RNA in Genomic Sequences
Detecting non-coding RNA in Genomic Sequences I. Overview of ncrnas II. What s specific about RNA detection? III. Looking for known RNAs IV. Looking for unknown RNAs Daniel Gautheret INSERM ERM 206 & Université
More informationMarkov Models & DNA Sequence Evolution
7.91 / 7.36 / BE.490 Lecture #5 Mar. 9, 2004 Markov Models & DNA Sequence Evolution Chris Burge Review of Markov & HMM Models for DNA Markov Models for splice sites Hidden Markov Models - looking under
More informationO 3 O 4 O 5. q 3. q 4. Transition
Hidden Markov Models Hidden Markov models (HMM) were developed in the early part of the 1970 s and at that time mostly applied in the area of computerized speech recognition. They are first described in
More information3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) Content HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University,
More informationHMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington
More informationBioinformatics (GLOBEX, Summer 2015) Pairwise sequence alignment
Bioinformatics (GLOBEX, Summer 2015) Pairwise sequence alignment Substitution score matrices, PAM, BLOSUM Needleman-Wunsch algorithm (Global) Smith-Waterman algorithm (Local) BLAST (local, heuristic) E-value
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More informationHidden Markov Models
Hidden Markov Models Slides revised and adapted to Bioinformática 55 Engª Biomédica/IST 2005 Ana Teresa Freitas Forward Algorithm For Markov chains we calculate the probability of a sequence, P(x) How
More informationCONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018
CONCEPT OF SEQUENCE COMPARISON Natapol Pornputtapong 18 January 2018 SEQUENCE ANALYSIS - A ROSETTA STONE OF LIFE Sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of
More informationToday s Lecture: HMMs
Today s Lecture: HMMs Definitions Examples Probability calculations WDAG Dynamic programming algorithms: Forward Viterbi Parameter estimation Viterbi training 1 Hidden Markov Models Probability models
More informationA Method for Aligning RNA Secondary Structures
Method for ligning RN Secondary Structures Jason T. L. Wang New Jersey Institute of Technology J Liu, JTL Wang, J Hu and B Tian, BM Bioinformatics, 2005 1 Outline Introduction Structural alignment of RN
More informationHidden Markov Models and Their Applications in Biological Sequence Analysis
Hidden Markov Models and Their Applications in Biological Sequence Analysis Byung-Jun Yoon Dept. of Electrical & Computer Engineering Texas A&M University, College Station, TX 77843-3128, USA Abstract
More informationPredicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis
Fang XY, Luo ZG, Wang ZH. Predicting RNA secondary structure using profile stochastic context-free grammars and phylogenic analysis. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 23(4): 582 589 July 2008
More informationEarly History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species
Schedule Bioinformatics and Computational Biology: History and Biological Background (JH) 0.0 he Parsimony criterion GKN.0 Stochastic Models of Sequence Evolution GKN 7.0 he Likelihood criterion GKN 0.0
More informationModel Accuracy Measures
Model Accuracy Measures Master in Bioinformatics UPF 2017-2018 Eduardo Eyras Computational Genomics Pompeu Fabra University - ICREA Barcelona, Spain Variables What we can measure (attributes) Hypotheses
More informationAn Introduction to Bioinformatics Algorithms Hidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationIn Genomes, Two Types of Genes
In Genomes, Two Types of Genes Protein-coding: [Start codon] [codon 1] [codon 2] [ ] [Stop codon] + DNA codons translated to amino acids to form a protein Non-coding RNAs (NcRNAs) No consistent patterns
More informationMATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME
MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2014 1 HMM Lecture Notes Dannie Durand and Rose Hoberman November 6th Introduction In the last few lectures, we have focused on three problems related
More information10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison
10-810: Advanced Algorithms and Models for Computational Biology microrna and Whole Genome Comparison Central Dogma: 90s Transcription factors DNA transcription mrna translation Proteins Central Dogma:
More informationHidden Markov Models
Hidden Markov Models A selection of slides taken from the following: Chris Bystroff Protein Folding Initiation Site Motifs Iosif Vaisman Bioinformatics and Gene Discovery Colin Cherry Hidden Markov Models
More informationEvolutionary Models. Evolutionary Models
Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment
More informationHMMs and biological sequence analysis
HMMs and biological sequence analysis Hidden Markov Model A Markov chain is a sequence of random variables X 1, X 2, X 3,... That has the property that the value of the current state depends only on the
More informationStatistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD Department of Computer Science University of Missouri 2008 Free for Academic
More informationSearching genomes for non-coding RNA using FastR
Searching genomes for non-coding RNA using FastR Shaojie Zhang Brian Haas Eleazar Eskin Vineet Bafna Keywords: non-coding RNA, database search, filtration, riboswitch, bacterial genome. Address for correspondence:
More informationUsing Ensembles of Hidden Markov Models for Grand Challenges in Bioinformatics
Using Ensembles of Hidden Markov Models for Grand Challenges in Bioinformatics Tandy Warnow Founder Professor of Engineering The University of Illinois at Urbana-Champaign http://tandy.cs.illinois.edu
More informationNewly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:
m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail
More informationCISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I)
CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I) Contents Alignment algorithms Needleman-Wunsch (global alignment) Smith-Waterman (local alignment) Heuristic algorithms FASTA BLAST
More informationBias in RNA sequencing and what to do about it
Bias in RNA sequencing and what to do about it Walter L. (Larry) Ruzzo Computer Science and Engineering Genome Sciences University of Washington Fred Hutchinson Cancer Research Center Seattle, WA, USA
More informationDe novo prediction of structural noncoding RNAs
1/ 38 De novo prediction of structural noncoding RNAs Stefan Washietl 18.417 - Fall 2011 2/ 38 Outline Motivation: Biological importance of (noncoding) RNAs Algorithms to predict structural noncoding RNAs
More informationInDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9
Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri RNA Structure Prediction Secondary
More informationHidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationProtein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.
Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein
More information98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006
98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006 8.3.1 Simple energy minimization Maximizing the number of base pairs as described above does not lead to good structure predictions.
More informationAlgorithms in Computational Biology (236522) spring 2008 Lecture #1
Algorithms in Computational Biology (236522) spring 2008 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: 15:30-16:30/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office hours:??
More informationRNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17
RNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17 Dr. Stefan Simm, 01.11.2016 simm@bio.uni-frankfurt.de RNA secondary structures a. hairpin loop b. stem c. bulge loop d. interior loop e. multi
More informationBio 1B Lecture Outline (please print and bring along) Fall, 2007
Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution
More informationModule: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment
Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Introduction to Bioinformatics online course : IBT Jonathan Kayondo Learning Objectives Understand
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationMultiple Sequence Alignment using Profile HMM
Multiple Sequence Alignment using Profile HMM. based on Chapter 5 and Section 6.5 from Biological Sequence Analysis by R. Durbin et al., 1998 Acknowledgements: M.Sc. students Beatrice Miron, Oana Răţoi,
More informationLearning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling
Learning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 009 Mark Craven craven@biostat.wisc.edu Sequence Motifs what is a sequence
More informationIntroduction to Hidden Markov Models for Gene Prediction ECE-S690
Introduction to Hidden Markov Models for Gene Prediction ECE-S690 Outline Markov Models The Hidden Part How can we use this for gene prediction? Learning Models Want to recognize patterns (e.g. sequence
More informationThe wonderful world of RNA informatics
December 9, 2012 Course Goals Familiarize you with the challenges involved in RNA informatics. Introduce commonly used tools, and provide an intuition for how they work. Give you the background and confidence
More informationCMPS 6630: Introduction to Computational Biology and Bioinformatics. Structure Comparison
CMPS 6630: Introduction to Computational Biology and Bioinformatics Structure Comparison Protein Structure Comparison Motivation Understand sequence and structure variability Understand Domain architecture
More informationHidden Markov models in population genetics and evolutionary biology
Hidden Markov models in population genetics and evolutionary biology Gerton Lunter Wellcome Trust Centre for Human Genetics Oxford, UK April 29, 2013 Topics for today Markov chains Hidden Markov models
More informationStatistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD William and Nancy Thompson Missouri Distinguished Professor Department
More information(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.
1. A change that makes a polypeptide defective has been discovered in its amino acid sequence. The normal and defective amino acid sequences are shown below. Researchers are attempting to reproduce the
More informationIntroduction to Evolutionary Concepts
Introduction to Evolutionary Concepts and VMD/MultiSeq - Part I Zaida (Zan) Luthey-Schulten Dept. Chemistry, Beckman Institute, Biophysics, Institute of Genomics Biology, & Physics NIH Workshop 2009 VMD/MultiSeq
More information6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008
MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationBioinformatics Chapter 1. Introduction
Bioinformatics Chapter 1. Introduction Outline! Biological Data in Digital Symbol Sequences! Genomes Diversity, Size, and Structure! Proteins and Proteomes! On the Information Content of Biological Sequences!
More informationGenomics and bioinformatics summary. Finding genes -- computer searches
Genomics and bioinformatics summary 1. Gene finding: computer searches, cdnas, ESTs, 2. Microarrays 3. Use BLAST to find homologous sequences 4. Multiple sequence alignments (MSAs) 5. Trees quantify sequence
More informationWhole Genome Alignments and Synteny Maps
Whole Genome Alignments and Synteny Maps IINTRODUCTION It was not until closely related organism genomes have been sequenced that people start to think about aligning genomes and chromosomes instead of
More informationBMI/CS 576 Fall 2016 Final Exam
BMI/CS 576 all 2016 inal Exam Prof. Colin Dewey Saturday, December 17th, 2016 10:05am-12:05pm Name: KEY Write your answers on these pages and show your work. You may use the back sides of pages as necessary.
More informationHomology Modeling. Roberto Lins EPFL - summer semester 2005
Homology Modeling Roberto Lins EPFL - summer semester 2005 Disclaimer: course material is mainly taken from: P.E. Bourne & H Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton,
More informationBLAST Database Searching. BME 110: CompBio Tools Todd Lowe April 8, 2010
BLAST Database Searching BME 110: CompBio Tools Todd Lowe April 8, 2010 Admin Reading: Read chapter 7, and the NCBI Blast Guide and tutorial http://www.ncbi.nlm.nih.gov/blast/why.shtml Read Chapter 8 for
More informationSupplementary Information for Discovery and characterization of indel and point mutations
Supplementary Information for Discovery and characterization of indel and point mutations using DeNovoGear Avinash Ramu 1 Michiel J. Noordam 1 Rachel S. Schwartz 2 Arthur Wuster 3 Matthew E. Hurles 3 Reed
More informationData Mining in Bioinformatics HMM
Data Mining in Bioinformatics HMM Microarray Problem: Major Objective n Major Objective: Discover a comprehensive theory of life s organization at the molecular level 2 1 Data Mining in Bioinformatics
More informationPage 1. References. Hidden Markov models and multiple sequence alignment. Markov chains. Probability review. Example. Markovian sequence
Page Hidden Markov models and multiple sequence alignment Russ B Altman BMI 4 CS 74 Some slides borrowed from Scott C Schmidler (BMI graduate student) References Bioinformatics Classic: Krogh et al (994)
More informationMarkov Chains and Hidden Markov Models. = stochastic, generative models
Markov Chains and Hidden Markov Models = stochastic, generative models (Drawing heavily from Durbin et al., Biological Sequence Analysis) BCH339N Systems Biology / Bioinformatics Spring 2016 Edward Marcotte,
More information6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008
MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, etworks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationNeural Networks for Protein Structure Prediction Brown, JMB CS 466 Saurabh Sinha
Neural Networks for Protein Structure Prediction Brown, JMB 1999 CS 466 Saurabh Sinha Outline Goal is to predict secondary structure of a protein from its sequence Artificial Neural Network used for this
More informationAlignment & BLAST. By: Hadi Mozafari KUMS
Alignment & BLAST By: Hadi Mozafari KUMS SIMILARITY - ALIGNMENT Comparison of primary DNA or protein sequences to other primary or secondary sequences Expecting that the function of the similar sequence
More informationBLAST. Varieties of BLAST
BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database
More informationComputational approaches for functional genomics
Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding
More informationMatrix-based pattern discovery algorithms
Regulatory Sequence Analysis Matrix-based pattern discovery algorithms Jacques.van.Helden@ulb.ac.be Université Libre de Bruxelles, Belgique Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe)
More informationHidden Markov Models
Andrea Passerini passerini@disi.unitn.it Statistical relational learning The aim Modeling temporal sequences Model signals which vary over time (e.g. speech) Two alternatives: deterministic models directly
More informationGenome 373: Hidden Markov Models II. Doug Fowler
Genome 373: Hidden Markov Models II Doug Fowler Review From Hidden Markov Models I What does a Markov model describe? Review From Hidden Markov Models I A T A Markov model describes a random process of
More informationLecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008
Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008 1 Sequence Motifs what is a sequence motif? a sequence pattern of biological significance typically
More informationAlignment Algorithms. Alignment Algorithms
Midterm Results Big improvement over scores from the previous two years. Since this class grade is based on the previous years curve, that means this class will get higher grades than the previous years.
More informationIMPROVEMENT OF STRUCTURE CONSERVATION INDEX WITH CENTROID ESTIMATORS
1 IMPROVEMENT OF STRUCTURE CONSERVATION INDEX WITH CENTROID ESTIMATORS YOHEI OKADA Department of Biosciences and Informatics, Keio University, 3 14 1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa 223 8522, Japan
More informationSequence Alignment Techniques and Their Uses
Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this
More informationSequence Alignment: A General Overview. COMP Fall 2010 Luay Nakhleh, Rice University
Sequence Alignment: A General Overview COMP 571 - Fall 2010 Luay Nakhleh, Rice University Life through Evolution All living organisms are related to each other through evolution This means: any pair of
More informationA graph kernel approach to the identification and characterisation of structured non-coding RNAs using multiple sequence alignment information
graph kernel approach to the identification and characterisation of structured noncoding RNs using multiple sequence alignment information Mariam lshaikh lbert Ludwigs niversity Freiburg, Department of
More information13 Comparative RNA analysis
13 Comparative RNA analysis Sources for this lecture: R. Durbin, S. Eddy, A. Krogh und G. Mitchison, Biological sequence analysis, Cambridge, 1998 D.W. Mount. Bioinformatics: Sequences and Genome analysis,
More informationComputational methods for predicting protein-protein interactions
Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational
More informationTiffany Samaroo MB&B 452a December 8, Take Home Final. Topic 1
Tiffany Samaroo MB&B 452a December 8, 2003 Take Home Final Topic 1 Prior to 1970, protein and DNA sequence alignment was limited to visual comparison. This was a very tedious process; even proteins with
More informationCSE 527 Phylogeny & RNA: Pfold. Lectures Autumn 2006
CSE 527 Phylogeny & RNA: Pfold Lectures 20-21 Autumn 2006 Phylogenies (aka Evolutionary Trees) Nothing in biology makes sense, except in the light of evolution -- Dobzhansky Modeling Sequence Evolution
More information