Pattern matching in highly similar sequences
Transcription
1 Pattern matching in highly similar sequences
Thierry Lecroq, joint work with N. Ben Nsira and É. Prieur-Gaston
Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes (LITIS EA4108), Université de Rouen Normandie, France
Mathematics / Computer Science Day, 12 October 2018, Rouen, France
2 Outline
1 Bioinformatics
2 Examples of projects
3 Search in highly similar sequences
Thierry Lecroq (LITIS, URN) Matching in similar sequences 12th October / 50
4 Bioinformaticians
5 Bioinformatics (in silico biology)
Excerpt from Wikipedia, May 2015: Bioinformatics comprises all the concepts and techniques needed for the computational interpretation of biological information. Several application fields or subdisciplines of bioinformatics have emerged:
- sequence bioinformatics, [...]
- structural bioinformatics, [...]
- network bioinformatics, [...]
- statistical bioinformatics and population bioinformatics.
6 Genetics
7 Size of genomes (Mb)
Escherichia coli (bacteria)         4.6
Saccharomyces cerevisiae (yeast)    13
C. elegans (worm)                   100
Arabidopsis thaliana (plant)        125
Drosophila melanogaster (fly)       180
Rice                                400
Homo sapiens                        3300
Fern
Amoeba dubia (amoeba)
8 From Biology to Computer Science
9 Sequencing
- before 2005: slow and expensive
- since 2005: NGS, faster and faster, cheaper and cheaper
10 Data deluge
11 Next Generation Sequencing (NGS)
[Figure: sequencer producing reads]
Millions of short (length 150) fragments called reads
2 types of projects:
- alignment (mapping) on a reference genome
- assembly for building a genome
12 Mapping
13 Resequencing
[Figure: reference genome]
14 Resequencing
[Figure: sequenced genome and reads]
15 Single Nucleotide Polymorphism (SNP) or Single Nucleotide Variant (SNV)
[Figure: reference genome, sequenced genome and reads, with differences marked x]
16 Sequencing errors
17 Sequencing errors
[Figure: reference genome, sequenced genome and reads, with a difference marked x]
18 Outline
1 Bioinformatics
2 Examples of projects
3 Search in highly similar sequences
19 Book of exercises (with solutions)
20 Example: Zimin words
$Z_1 = x_1$ and ($\forall k > 1$) $Z_k = Z_{k-1} x_k Z_{k-1}$
Example: abacabadabacaba
Question: give a linear-time algorithm for computing all Zimin-type prefixes of a given string.
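The recurrence for Zimin words can be illustrated with a short sketch. The function name `zimin` and the choice of passing the letters $x_1, \ldots, x_k$ as a string are assumptions made for this illustration; the sketch only builds $Z_k$ and does not answer the exercise about Zimin-type prefixes.

```python
def zimin(letters):
    """Build Z_k from the letters x_1, ..., x_k (hypothetical helper).

    Follows Z_1 = x_1 and Z_k = Z_{k-1} x_k Z_{k-1}.
    """
    word = ""
    for x in letters:
        # The empty word plays the role of Z_0, so the first step gives Z_1 = x_1.
        word = word + x + word
    return word


if __name__ == "__main__":
    print(zimin("abcd"))
```

With the letters a, b, c, d this reproduces the example word of the slide, whose length is $2^4 - 1 = 15$.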
21 New location
22 miRabel
bioinfo.univ-rouen.fr/mirabel
Aggregation of software for the prediction of microRNA targets
Collaboration with INSERM U982 DC2N and LMRS UMR 6085 CNRS
A. Quillet, C. Saad, G. Ferry, Y. Anouar, N. Vergne, T. Lecroq and C. Dubessy. Improving bioinformatics prediction of microRNA targets by ranks aggregation. bioRxiv, 2017
23 Phaeodactylum tricornutum
Differential transcriptomic analysis (RNA-seq) of an alga with 3 phenotypes
Collaboration with GlycoMEV EA 4358 and LMRS UMR 6085
C. Ovide, M.-C. Kiefer-Meyer, C. Bérard, N. Vergne, T. Lecroq, C. Plasson, C. Burel, S. Bernard, A. Driouich, P. Lerouge, I. Tournier, H. Dauchel and M. Bardor. Comparative in depth RNA sequencing of P. tricornutum's morphotypes reveals specific features of the oval morphotype. Scientific Reports, 8, Article number: 14340, 2018
24 Third generation read correction
- Hybrid correction
- Self-correction
- Tool for evaluating correctors
PhD thesis of Pierre Morisse (grant from Univ. Rouen Normandie, 2016)
P. Morisse, T. Lecroq and A. Lefebvre. Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph. Bioinformatics, 2018, accepted
25 UMI (Unique Molecular Identifier)
Design of new algorithms for processing UMIs from NGS data
PhD thesis of Ahmad Abdel Sater (grant from Normandie Region, 2018)
Collaboration with INSERM U1245 at CRLCC Henri Becquerel
P.-J. Viailly, E. Bohers, M. Viennot, A. Abdel Sater, V. Marchand, P. Ruminy, H. Dauchel, T. Lecroq, M. Becker, P. Etancelin, H. Tilly, P. Vera and F. Jardin. I-LowVarFreq: improving low-frequency variant detection using a new UMI-based variant calling approach for paired-end sequencing NGS libraries. Proceedings of JOBIM, Marseille, France, 2018
26 Mass spectrometry data
Study of Pemphigus
Collaboration with INSERM U1234
Reconstruction of peptides: H T W Y Q K K P N A A P R Q Q K P N A A P R L L L Y
M. Petit, M.-L. Walet-Balieu, P. Chan Tchi Song, L. Drouot, C. Burel, M. Maho-Vaillant, T. Lecroq, P. Cosette, D. Vaudry, O. Boyer, M. Bardor, P. Joly, S. Calbo. Longitudinal study of anti-Dsg3 IgG repertoire by proteomics in Pemphigus following Rituximab treatment. Poster, JNRb, Rouen, France, 2018
27 Outline
1 Bioinformatics
2 Examples of projects
3 Search in highly similar sequences
28 Pattern matching
Find one (or all the) position(s) of a pattern of length m in a sequence of length n:
- with index: O(m)
- without index: O(n)
29 Suffix Trie
y = atatgat$
Suffixes of y by starting position:
0 atatgat$
1 tatgat$
2 atgat$
3 tgat$
4 gat$
5 at$
6 t$
7 $
[Figure: suffix trie of y, with one leaf per suffix labeled by its starting position]
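As a rough illustration of the suffix trie of y = atatgat$, the sketch below stores the trie as nested dictionaries. The names `build_suffix_trie` and `is_factor`, and the use of the key `None` to store a leaf's starting position, are choices made for this example only. Note that this naive construction takes quadratic time and space, unlike the linear-time suffix tree algorithms.

```python
def build_suffix_trie(y):
    """Insert every suffix of y into a trie of nested dicts (quadratic size)."""
    root = {}
    for start in range(len(y)):
        node = root
        for ch in y[start:]:
            node = node.setdefault(ch, {})
        # y ends with the unique letter '$', so every suffix ends at its own leaf.
        node[None] = start  # starting position of the suffix
    return root


def is_factor(trie, x):
    """Return True if x labels a path from the root, i.e. x is a factor of y."""
    node = trie
    for ch in x:
        if ch not in node:
            return False
        node = node[ch]
    return True


if __name__ == "__main__":
    trie = build_suffix_trie("atatgat$")
    print(is_factor(trie, "ta"), is_factor(trie, "tt"))
```

Testing whether a pattern of length m is a factor only follows m edges from the root, which is the O(m) search time obtained with an index.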
32 Suffix Tree
[Figure: the suffix trie of y = atatgat$ in which each unary path is compacted into a single edge, with edge labels at, t, $, gat$ and atgat$]
34 Suffix Tree
y = atatgat$
Each edge label is stored as a (position, length) pair: at = (0,2), t = (1,1), $ = (7,1), gat$ = (4,4), atgat$ = (2,6)
[Weiner 73, McCreight 76, Ukkonen 92, Farach 97]
38 Suffix Tree
y = atatgat$
- ta is a factor of y; tt is not
- at occurs 3 times, at positions 0, 2 and 5
- t occurs 3 times, at positions 1, 3 and 6
- the length of the longest common prefix of the suffixes starting at positions 2 and 5 is 2
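The facts listed for y = atatgat$ can be checked with a brute-force sketch. It deliberately does not use a suffix tree (which would answer the same queries faster); the helper names `occurrences` and `lcp` are assumptions for this illustration.

```python
def occurrences(x, y):
    """All starting positions of x in y, found by repeated str.find calls."""
    positions = []
    pos = y.find(x)
    while pos != -1:
        positions.append(pos)
        pos = y.find(x, pos + 1)
    return positions


def lcp(y, i, j):
    """Length of the longest common prefix of the suffixes y[i:] and y[j:]."""
    length = 0
    while i + length < len(y) and j + length < len(y) and y[i + length] == y[j + length]:
        length += 1
    return length


if __name__ == "__main__":
    y = "atatgat$"
    print(occurrences("at", y), occurrences("t", y), lcp(y, 2, 5))
```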
44 Complexities
Algorithms for building suffix trees are:
- on-line
- linear time
- linear space
45 Highly similar sequences
r sequences (here r = 4):
$y_0$ = ATGCTAGCAAGATACAG
$y_1$ = ATGCTAGCAACATACAG
$y_2$ = ATGCGAGCAAGATACAG
$y_3$ = ATGCTAGCAACATACAT
46 Highly similar sequences
Representation as a single degenerate sequence:
y = ATGC{G,T}AGCAA{C,G}ATACA{G,T}
47 Highly similar sequences
The pattern GAGCAAC matches the degenerate sequence y at position 4, although it occurs in none of $y_0, \ldots, y_3$.
48 Highly similar sequences
Representation as a reference sequence plus variations: $y_0$ and
Z = (({2}, 4, G), ({1, 3}, 10, C), ({3}, 16, T))
where a triple (G, p, c) means that every sequence whose index belongs to G has the letter c at position p instead of $y_0[p]$.
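The list Z of variations can be derived from the r sequences with a small sketch. It assumes, as in the example, that all sequences have the same length and differ from $y_0$ by substitutions only; the function name `variations` and the output layout (a list of `(set_of_indices, position, letter)` triples sorted by position) are choices made for this illustration.

```python
def variations(seqs):
    """Describe seqs[1:] as substitutions with respect to the reference seqs[0]."""
    ref = seqs[0]
    groups = {}  # (position, letter) -> set of indices of sequences carrying it
    for h, seq in enumerate(seqs[1:], start=1):
        for pos, (a, b) in enumerate(zip(ref, seq)):
            if a != b:
                groups.setdefault((pos, b), set()).add(h)
    return [(g, pos, c) for (pos, c), g in sorted(groups.items())]


if __name__ == "__main__":
    seqs = [
        "ATGCTAGCAAGATACAG",  # y0, the reference
        "ATGCTAGCAACATACAG",  # y1
        "ATGCGAGCAAGATACAG",  # y2
        "ATGCTAGCAACATACAT",  # y3
    ]
    print(variations(seqs))
```

Storing only $y_0$ and Z avoids the false matches of the degenerate representation, because each variation remembers which sequences carry it.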
49 Sliding window
[Figure: a window of length m holding the pattern x slides along the text y of length n]
50 Knuth-Morris-Pratt algorithm (1977)
[Figure: a prefix u of x matches y up to position j, where the letter a of x mismatches the letter b of y; x is then shifted so that a border z of u, followed by a letter c, is aligned with the end of u]
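A compact sketch of prefix-based searching in the style of Morris-Pratt and Knuth-Morris-Pratt may help here. It uses the plain border table (the Morris-Pratt variant, without the additional KMP letter condition on shifts); the names `border_table` and `kmp_search` are assumptions for this illustration.

```python
def border_table(x):
    """border[i] = length of the longest border of x[:i], with border[0] = -1."""
    border = [0] * (len(x) + 1)
    border[0] = -1
    k = -1
    for i in range(len(x)):
        while k >= 0 and x[k] != x[i]:
            k = border[k]
        k += 1
        border[i + 1] = k
    return border


def kmp_search(x, y):
    """Return all starting positions of the non-empty pattern x in the text y."""
    if not x:
        raise ValueError("pattern must be non-empty")
    border = border_table(x)
    m = len(x)
    found = []
    k = 0  # length of the prefix of x currently matched
    for j, c in enumerate(y):
        # Fall back through borders after a full match or a mismatch.
        while k >= 0 and (k == m or x[k] != c):
            k = border[k]
        k += 1
        if k == m:
            found.append(j - m + 1)
    return found


if __name__ == "__main__":
    print(kmp_search("at", "atatgat$"))
```

The text pointer never moves backwards, so the search reads y on-line and runs in O(n) time after O(m) preprocessing.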
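51 Boyer-Moore algorithm
[Figure: comparisons are performed from right to left; a suffix z of x matches, then the letter a of x mismatches the letter b of y; x is shifted so that another occurrence of z in x, preceded by a letter c, is aligned with z]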
52 For highly similar sequences
Hamming distance
For $u, v \in A^*$ such that $|u| = |v|$:
$\mathrm{Ham}(u, v) = |\{i \mid u[i] \neq v[i]\}|$
Longest Common Extension
For $x \in A^*$ and $0 \le i \le j \le |x| - 1$:
$\mathrm{LCE}_k^x(i, j) = \max\{\ell \mid \mathrm{Ham}(x[i..i+\ell-1], x[j..j+\ell-1]) \le k\}$
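Both definitions translate directly into code. The sketch below is a naive reference implementation, with the hypothetical names `ham` and `lce_k`; it scans letter by letter and is meant only to make the definitions concrete.

```python
def ham(u, v):
    """Hamming distance: number of positions where u and v differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(u, v))


def lce_k(x, i, j, k):
    """Largest l such that x[i:i+l] and x[j:j+l] differ in at most k positions."""
    mismatches = 0
    length = 0
    while i + length < len(x) and j + length < len(x):
        if x[i + length] != x[j + length]:
            if mismatches == k:
                break  # a (k+1)-th mismatch would be needed
            mismatches += 1
        length += 1
    return length


if __name__ == "__main__":
    print(ham("ATGCT", "ATGCG"), lce_k("atatgat$", 2, 5, 0), lce_k("atatgat$", 2, 5, 1))
```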
53 Kangaroo jumps
$\mathrm{LCE}_k^x(i, j)$ can be computed in O(k) time after O(n) preprocessing time.
[Figure: starting from positions i and j, successive jumps over matching stretches, each jump ending at a mismatch (numbered 1, 2, ...)]
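The structure of kangaroo jumps can be sketched as follows: each jump is one exact longest-common-extension query, followed by skipping one mismatch. In this illustration the exact query `naive_lce` is a letter-by-letter stand-in, so the sketch does not achieve O(k) time; the stated bound assumes a constant-time exact LCE oracle built during the O(n) preprocessing (for instance from a suffix tree). All names are assumptions for the example.

```python
def naive_lce(x, i, j):
    """Stand-in for a constant-time exact LCE oracle (here: plain scanning)."""
    length = 0
    while i + length < len(x) and j + length < len(x) and x[i + length] == x[j + length]:
        length += 1
    return length


def lce_k_kangaroo(x, i, j, k, exact_lce=naive_lce):
    """LCE with at most k mismatches, using at most k + 1 exact LCE queries."""
    n = len(x)
    length = 0
    mismatches = 0
    while True:
        # Jump over the next stretch of matching letters.
        length += exact_lce(x, i + length, j + length)
        # Stop at the end of x, or at a mismatch that can no longer be absorbed.
        if max(i, j) + length >= n or mismatches == k:
            return length
        mismatches += 1
        length += 1  # step over the mismatching pair of letters


if __name__ == "__main__":
    print(lce_k_kangaroo("atatgat$", 2, 5, 1))
```

Passing the oracle as the parameter `exact_lce` keeps the jumping logic independent of how exact queries are answered.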
60 References
Restriction: 1 variation on a window of size m
N. Ben Nsira, T. Lecroq and M. Elloumi. A fast Boyer-Moore type pattern matching algorithm for highly similar sequences. International Journal of Data Mining and Bioinformatics 13(3) (2015)
N. Ben Nsira, T. Lecroq and M. Elloumi. On-line String Matching in Highly Similar DNA Sequences. Mathematics in Computer Science 11(2) (2017)
61 2 variants
- searching for a finite set of patterns
- relaxing the restriction from 1 to k variations on a window of size m
62 Single pattern with at most k variations
Applying the Landau-Vishkin algorithm as a filter: searching with k mismatches in O(kn)
When $\mathrm{Ham}(x, y_0[j..j+m-1]) = \ell \le k$:
- $\ell = 0$: an exact occurrence of the pattern has been found in $y_0$ and in all the other sequences that have no variation with respect to $y_0$ between positions $j$ and $j+m-1$, both included.
- $\ell > 0$: let $W = \{i_0, \ldots, i_{\ell-1}\}$ be the set of the $\ell$ positions such that $y_0[j+i_p] \neq x[i_p]$ with $0 \le p < \ell$. Then $x$ occurs exactly in $y_g$ at position $j$ if:
  - $(G, j+i_p, x[i_p]) \in Z$ with $g \in G$ for all $0 \le p < \ell$;
  - there is no $(G, h, c) \in Z$ with $g \in G$ and $j \le h \le j+m-1$ such that $h - j \notin W$.
63 Single pattern with at most k variations
r = 2 and k = [...]
$y_0$ = ACCTACGACTA
$y_1$ = ACCTACTACTT
x = CTACTT
[Figure: x aligned at two positions of the sequences]
Our solution runs in time O(knr).
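The occurrence test on the reference sequence and its variations can be sketched on this example. Several points are assumptions: the mismatch positions W are computed by naive comparison rather than by the Landau-Vishkin filter, the value k = 2 is chosen for illustration, the list Z is derived here by comparing $y_1$ with $y_0$, and the name `search_similar` is hypothetical.

```python
def search_similar(x, y0, Z, r, k):
    """Report (g, j) whenever x occurs exactly in y_g at position j.

    y0 is the reference, Z a list of (set_of_indices, position, letter)
    substitutions, r the number of sequences, k the maximum number of
    variations allowed in a window of length len(x).
    """
    m = len(x)
    found = []
    for j in range(len(y0) - m + 1):
        # W: window offsets where the reference disagrees with the pattern.
        W = [i for i in range(m) if y0[j + i] != x[i]]
        if len(W) > k:
            continue
        for g in range(r):
            # Variations carried by y_g inside the window, as offset -> letter.
            var = {p - j: c for (G, p, c) in Z if g in G and j <= p < j + m}
            # y_g must change exactly the offsets in W, and into the letters of x.
            if set(var) == set(W) and all(var[i] == x[i] for i in W):
                found.append((g, j))
    return found


if __name__ == "__main__":
    y0 = "ACCTACGACTA"
    Z = [({1}, 6, "T"), ({1}, 10, "T")]  # y1 = ACCTACTACTT
    print(search_similar("CTACTT", y0, Z, r=2, k=2))
```

On this input the only occurrence is in $y_1$ at position 5: the window of $y_0$ differs from x at offsets 1 and 5, and $y_1$ carries exactly those two substitutions.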
65 Multiple patterns with at most 1 variation
- Build a classical trie of the patterns
- Scan the highly similar sequences with at most 2 active states
66 Multiple patterns with at most 1 variation
X = {ACGA, ACTA, CTA} and r = 2 sequences
[Figure: pattern-matching automaton of X with states 0 to 9 and a loop on state 0 labeled Σ \ {A, C}; outputs {ACGA} at state 4, {ACTA, CTA} at state 6 and {CTA} at state 9; the scan of $y_0$ = ACCTACGACTA, with the two variations T and T of $y_1$, shows the active states]
67 Multiple patterns with at most 1 variation
Our solution runs in time O(n) for the searching phase and in time O(s) for the preprocessing phase, where $s = \sum_{x \in X} |x|$.
Experiments on similar sequences of different lengths with patterns of length [...]
[Figure: running time (s) as a function of the sequence length for EDSM, LVsim and ACsim]
68 References
N. Ben Nsira, T. Lecroq and É. Prieur-Gaston. Practical fast exact pattern matching algorithm for highly similar sequences. International Conference on Bioinformatics and Biomedicine (BIBM), 2018, Madrid, submitted
69 Perspectives
- Adapt other pattern matching techniques
- Relax the restrictions
- Adaptive analysis
71 Thank you for your attention!
More informationUniversità della Calabria
Università della Calabria Facoltà di Ingegneria BIOINFORMATICS TECHNIQUES AND METHODOLOGIES Research group coordinated by Prof. Luigi Palopoli Lecturer: Simona Rombo OUTLINE 1. Introduction to Bioinformatics
More informationLinear-Time Computation of Local Periods
Linear-Time Computation of Local Periods Jean-Pierre Duval 1, Roman Kolpakov 2,, Gregory Kucherov 3, Thierry Lecroq 4, and Arnaud Lefebvre 4 1 LIFAR, Université de Rouen, France Jean-Pierre.Duval@univ-rouen.fr
More informationData Structure for Dynamic Patterns
Data Structure for Dynamic Patterns Chouvalit Khancome and Veera Booning Member IAENG Abstract String matching and dynamic dictionary matching are significant principles in computer science. These principles
More informationModelling and Analysis in Bioinformatics. Lecture 1: Genomic k-mer Statistics
582746 Modelling and Analysis in Bioinformatics Lecture 1: Genomic k-mer Statistics Juha Kärkkäinen 06.09.2016 Outline Course introduction Genomic k-mers 1-Mers 2-Mers 3-Mers k-mers for Larger k Outline
More informationEnsembl Genomes (non-chordates): Quick tour. This quick tour provides a brief introduction to Ensembl Genomes [2], the non-chordate genome browser.
Paul Kersey [1] DNA & RNA Beginner 0.5 hour This quick tour provides a brief introduction to Ensembl Genomes [2], the non-chordate genome browser. Learning objectives: Basic understanding of Ensembl Genomes
More informationComputational Biology From The Perspective Of A Physical Scientist
Computational Biology From The Perspective Of A Physical Scientist Dr. Arthur Dong PP1@TUM 26 November 2013 Bioinformatics Education Curriculum Math, Physics, Computer Science (Statistics and Programming)
More informationIntroduction to Bioinformatics Integrated Science, 11/9/05
1 Introduction to Bioinformatics Integrated Science, 11/9/05 Morris Levy Biological Sciences Research: Evolutionary Ecology, Plant- Fungal Pathogen Interactions Coordinator: BIOL 495S/CS490B/STAT490B Introduction
More informationAll three must be approved Deadlines around: 21. sept, 26. okt, and 16. nov
INF 4130 / 9135 29/8-2012 Today s slides are produced mainly by Petter Kristiansen Lecturer Stein Krogdahl Mandatory assignments («Oblig1», «-2», and «-3»): All three must be approved Deadlines around:
More informationGenomes and Their Evolution
Chapter 21 Genomes and Their Evolution PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from
More informationPaired-End Read Length Lower Bounds for Genome Re-sequencing
1/11 Paired-End Read Length Lower Bounds for Genome Re-sequencing Rayan Chikhi ENS Cachan Brittany PhD student in the Symbiose team, Irisa, France 2/11 NEXT-GENERATION SEQUENCING Next-gen vs. traditional
More informationDrosophila melanogaster and D. simulans, two fruit fly species that are nearly
Comparative Genomics: Human versus chimpanzee 1. Introduction The chimpanzee is the closest living relative to humans. The two species are nearly identical in DNA sequence (>98% identity), yet vastly different
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More informationTheoretical distribution of PSSM scores
Regulatory Sequence Analysis Theoretical distribution of PSSM scores Jacques van Helden Jacques.van-Helden@univ-amu.fr Aix-Marseille Université, France Technological Advances for Genomics and Clinics (TAGC,
More informationECOL/MCB 320 and 320H Genetics
ECOL/MCB 320 and 320H Genetics Instructors Dr. C. William Birky, Jr. Dept. of Ecology and Evolutionary Biology Lecturing on Molecular genetics Transmission genetics Population and evolutionary genetics
More informationStatistical mass spectrometry-based proteomics
1 Statistical mass spectrometry-based proteomics Olga Vitek www.stat.purdue.edu Outline What is proteomics? Biological questions and technologies Protein quantification in label-free workflows Joint analysis
More informationProteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?
Proteomics What is it? Reveal protein interactions Protein profiling in a sample Yeast two hybrid screening High throughput 2D PAGE Automatic analysis of 2D Page Yeast two hybrid Use two mating strains
More informationCopyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years.
Structure Determination and Sequence Analysis The vast majority of the experimentally determined three-dimensional protein structures have been solved by one of two methods: X-ray diffraction and Nuclear
More informationI519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB
I519 Introduction to Bioinformatics, 2015 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism
More informationBayesian Clustering of Multi-Omics
Bayesian Clustering of Multi-Omics for Cardiovascular Diseases Nils Strelow 22./23.01.2019 Final Presentation Trends in Bioinformatics WS18/19 Recap Intermediate presentation Precision Medicine Multi-Omics
More informationImplementing Approximate Regularities
Implementing Approximate Regularities Manolis Christodoulakis Costas S. Iliopoulos Department of Computer Science King s College London Kunsoo Park School of Computer Science and Engineering, Seoul National
More informationInvestigation 3: Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST
Investigation 3: Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST Introduction Bioinformatics is a powerful tool which can be used to determine evolutionary relationships and
More informationProtein function studies: history, current status and future trends
19 3 2007 6 Chinese Bulletin of Life Sciences Vol. 19, No. 3 Jun., 2007 1004-0374(2007)03-0294-07 ( 100871) Q51A Protein function studies: history, current status and future trends MA Jing, GE Xi, CHANG
More informationWhole Genome Alignments and Synteny Maps
Whole Genome Alignments and Synteny Maps IINTRODUCTION It was not until closely related organism genomes have been sequenced that people start to think about aligning genomes and chromosomes instead of
More informationBioinformatics Chapter 1. Introduction
Bioinformatics Chapter 1. Introduction Outline! Biological Data in Digital Symbol Sequences! Genomes Diversity, Size, and Structure! Proteins and Proteomes! On the Information Content of Biological Sequences!
More informationGENOME-WIDE ANALYSIS OF CORE PROMOTER REGIONS IN EMILIANIA HUXLEYI
1 GENOME-WIDE ANALYSIS OF CORE PROMOTER REGIONS IN EMILIANIA HUXLEYI Justin Dailey and Xiaoyu Zhang Department of Computer Science, California State University San Marcos San Marcos, CA 92096 Email: daile005@csusm.edu,
More informationGenomes Comparision via de Bruijn graphs
Genomes Comparision via de Bruijn graphs Student: Ilya Minkin Advisor: Son Pham St. Petersburg Academic University June 4, 2012 1 / 19 Synteny Blocks: Algorithmic challenge Suppose that we are given two
More informationLinear-Space Alignment
Linear-Space Alignment Subsequences and Substrings Definition A string x is a substring of a string x, if x = ux v for some prefix string u and suffix string v (similarly, x = x i x j, for some 1 i j x
More informationProteomics. 2 nd semester, Department of Biotechnology and Bioinformatics Laboratory of Nano-Biotechnology and Artificial Bioengineering
Proteomics 2 nd semester, 2013 1 Text book Principles of Proteomics by R. M. Twyman, BIOS Scientific Publications Other Reference books 1) Proteomics by C. David O Connor and B. David Hames, Scion Publishing
More informationSmall RNA in rice genome
Vol. 45 No. 5 SCIENCE IN CHINA (Series C) October 2002 Small RNA in rice genome WANG Kai ( 1, ZHU Xiaopeng ( 2, ZHONG Lan ( 1,3 & CHEN Runsheng ( 1,2 1. Beijing Genomics Institute/Center of Genomics and
More informationOn the Sound Covering Cycle Problem in Paired de Bruijn Graphs
On the Sound Covering Cycle Problem in Paired de Bruijn Graphs Christian Komusiewicz 1 and Andreea Radulescu 2 1 Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany christian.komusiewicz@tu-berlin.de
More informationAnalysis of Algorithms Prof. Karen Daniels
UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Spring, 2012 Tuesday, 4/24/2012 String Matching Algorithms Chapter 32* * Pseudocode uses 2 nd edition conventions 1 Chapter
More informationRegulatory Sequence Analysis. Sequence models (Bernoulli and Markov models)
Regulatory Sequence Analysis Sequence models (Bernoulli and Markov models) 1 Why do we need random models? Any pattern discovery relies on an underlying model to estimate the random expectation. This model
More informationAdvanced Algorithms and Models for Computational Biology
Advanced Algorithms and Models for Computational Biology Introduction to cell biology genomics development and probability Eric ing Lecture January 3 006 Reading: Chap. DTM book Introduction to cell biology
More informationGenome Sequencing & DNA Sequence Analysis
7.91 / 7.36 / BE.490 Lecture #1 Feb. 24, 2004 Genome Sequencing & DNA Sequence Analysis Chris Burge What is a Genome? A genome is NOT a bag of proteins What s in the Human Genome? Outline of Unit II: DNA/RNA
More informationCAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools
CAP 5510: to : Tools ECS 254A / EC 2474; Phone x3748; Email: giri@cis.fiu.edu My Homepage: http://www.cs.fiu.edu/~giri http://www.cs.fiu.edu/~giri/teach/bioinfs15.html Office ECS 254 (and EC 2474); Phone:
More informationBIOINFORMATICS LAB AP BIOLOGY
BIOINFORMATICS LAB AP BIOLOGY Bioinformatics is the science of collecting and analyzing complex biological data. Bioinformatics combines computer science, statistics and biology to allow scientists to
More informationConsensus Optimizing Both Distance Sum and Radius
Consensus Optimizing Both Distance Sum and Radius Amihood Amir 1, Gad M. Landau 2, Joong Chae Na 3, Heejin Park 4, Kunsoo Park 5, and Jeong Seop Sim 6 1 Bar-Ilan University, 52900 Ramat-Gan, Israel 2 University
More information2. Exact String Matching
2. Exact String Matching Let T = T [0..n) be the text and P = P [0..m) the pattern. We say that P occurs in T at position j if T [j..j + m) = P. Example: P = aine occurs at position 6 in T = karjalainen.
More informationarxiv: v2 [cs.ds] 16 Mar 2015
Longest common substrings with k mismatches Tomas Flouri 1, Emanuele Giaquinta 2, Kassian Kobert 1, and Esko Ukkonen 3 arxiv:1409.1694v2 [cs.ds] 16 Mar 2015 1 Heidelberg Institute for Theoretical Studies,
More informationFaster Algorithms for String Matching with k Mismatches
794 Faster Algorithms for String Matching with k Mismatches Amihood Amir *t Bar-Ilan University and Georgia Tech Moshe Lewenstein* * Bar-Ilan University Ely Porat* Bar Ilan University and Weizmann Institute
More informationNetwork alignment and querying
Network biology minicourse (part 4) Algorithmic challenges in genomics Network alignment and querying Roded Sharan School of Computer Science, Tel Aviv University Multiple Species PPI Data Rapid growth
More informationQuasi-Linear Time Computation of the Abelian Periods of a Word
Quasi-Linear Time Computation of the Abelian Periods of a Word G. Fici 1, T. Lecroq 2, A. Lefebvre 2, É. Prieur-Gaston2, and W. F. Smyth 3 1 Gabriele.Fici@unice.fr, I3S, CNRS & Université Nice Sophia Antipolis,
More information