Ukkonen's suffix tree construction algorithm
- Delphia Harris
- 6 years ago
Transcription
1 Ukkonen's suffix tree construction algorithm [Figure: example suffix trees for the string aba$] String Algorithms; Nov
3 Motivation Yet another suffix tree construction algorithm... Why? It is an online algorithm, i.e. one we can update with new sequences.
4 Sketch of algorithm Recall McCreight's approach: for i = 1..n+1, build the compressed trie of {x$[j..n] : j ≤ i}. Ukkonen's approach: for i = 1..n+1, build the compressed trie of {x$[j..i] : j ≤ i}, i.e. the compressed trie of all suffixes of the prefix x$[1..i] of x$. This is a suffix tree except for the leaf property.
5 McCreight's algorithm x = aba [Figure: the trees T_1, ..., T_4, where T_i is the compressed trie of {x$[j..n] : j ≤ i}. T_1 holds only aba$ (leaf 1); T_2 adds leaf 2, T_3 adds leaf 3, and T_4 adds leaf 4, giving the suffix tree of aba$.]
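6 Ukkonen's algorithm x = aba [Figure: the trees T_1, ..., T_4, where T_i is the compressed trie of {x$[j..i] : j ≤ i}. T_1 holds a (leaf 1); T_2 holds ab (leaf 1) and b (leaf 2); T_3 holds aba (leaf 1) and ba (leaf 2); T_4 is the suffix tree of aba$ with leaves 1 to 4.] Note: T_3 has no node for x[3..3] = a.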
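7 Tasks in iteration i In iteration i we must update each x[j..i] to x[j..i+1] and add the string x[i+1..i+1] (a special case of the above). [Figure: three cases, depending on whether x[j..i] ends at a leaf, at an inner node, or inside an edge.]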
8 First attempt... Obvious algorithm: For i=1,...,n+1: for j=1,...,i: find x[j..i], append x[i+1]. Running time O(n^3). Need lots of tricks to get O(n)!
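The obvious algorithm can be sketched as follows. This is a minimal illustration, not the slides' code: it assumes 0-based indexing, uses an uncompressed trie of nested dicts instead of a compressed trie, and the names naive_online_suffix_trie and contains are made up for the example.

```python
def naive_online_suffix_trie(x):
    """Build an uncompressed trie of all suffixes of x, one character at a time."""
    root = {}
    for i in range(len(x)):            # iteration: extend the trie with x[i]
        for j in range(i + 1):         # every suffix start j (j == i is the empty suffix)
            node = root
            for c in x[j:i]:           # find: walk down x[j:i] from the root
                node = node[c]
            node.setdefault(x[i], {})  # append: add x[i] below it if it is missing
    return root


def contains(root, s):
    """Return True if s can be read from the root of the trie."""
    node = root
    for c in s:
        if c not in node:
            return False
        node = node[c]
    return True


trie = naive_online_suffix_trie("aba$")
print(contains(trie, "ba$"), contains(trie, "bb"))
```

The three nested loops (iterations, suffix starts, walking down from the root) are where the cubic running time comes from.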
9 Updating leaves for free If we label leaf edges with (k, ∞), denoting from k to the current i, updating a leaf is automatic. [Figure: the leaf case is marked as free.]
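One way to picture the open-ended leaf label is a shared end marker that every leaf edge points to. The sketch below is an assumption about how this could be implemented; the class names GlobalEnd and LeafEdge are hypothetical. It covers only the first two iterations for x = aba, where every suffix is a new leaf.

```python
class GlobalEnd:
    """Shared, mutable 'current i' that every open leaf edge refers to."""

    def __init__(self):
        self.value = 0


class LeafEdge:
    """Leaf edge labelled (start, current end); the end is never stored per leaf."""

    def __init__(self, start, end):
        self.start = start
        self.end = end  # the shared GlobalEnd object

    def label(self, text):
        return text[self.start:self.end.value]


text = "aba$"
end = GlobalEnd()
leaves = []
history = []
for i in (1, 2):                         # prefixes "a" and "ab"
    end.value = i                        # one O(1) update extends every existing leaf
    leaves.append(LeafEdge(i - 1, end))  # the new suffix starting at position i - 1
    history.append([leaf.label(text) for leaf in leaves])

print(history)
```

After the second step the first leaf reads ab without having been touched, which is the sense in which leaf updates are free.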
10 Updating existing strings is free If x[j..i+1] is already in the tree, the update is automatic. [Figure: the cases where the extension already exists are marked as free.]
11 Real operations If we can recognize the free operations, we need only deal with the remaining ones. [Figure: the remaining inner-node and edge cases.]
14 Lemma Let j denote the suffix x[j..i] of x[1..i]. a) If j>1 is a leaf node in T_i, then so is j-1 (once a leaf, always been a leaf). b) If, from j<i, there is a path in T_i that begins with a, then there is a path in T_i from j+1 beginning with a (once an edge, always an edge).
15 Proof of lemma a) If j>1 is a leaf node in T_i, then so is j-1. Assume j-1 is not a leaf. Then there exists k<j-1 such that x[j-1..i] is a proper prefix of x[k..i]. Dropping the first character of both strings, x[j..i] is a proper prefix of x[k+1..i], so j is not a leaf either, a contradiction.
16 Proof of lemma b) If, from j<i, there is a path in T_i that begins with a, then there is a path in T_i from j+1 beginning with a. Assume j is followed by a. Then there exists k<j such that x[j..i]a is a prefix of x[k..i]. Dropping the first character of both strings, x[j+1..i]a is a prefix of x[k+1..i]. Hence j+1 is followed by a.
17 Corollary In iteration i, there exist indices L and R such that: all suffixes j ≤ L are leaves (region I); all suffixes j ≥ R are already in the trie (region III); the suffixes with L < j < R form region II.
18 Consequence of corollary I and III are free operations; only the suffixes in II (L < j < R) need real work.
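The three regions can be checked by brute force on small strings. The sketch below is not part of the algorithm; it recomputes everything from the definitions, with 0-based indexing and the hypothetical helper names occurrences and classify. The labels 1, 2, 3 stand for regions I, II, III.

```python
def occurrences(text, s):
    """Count the (possibly overlapping) occurrences of s in text."""
    return sum(text.startswith(s, k) for k in range(len(text) - len(s) + 1))


def classify(x, i):
    """Label every suffix start j of x[:i] when the tree is extended by x[i]."""
    prefix = x[:i]
    labels = []
    for j in range(i + 1):             # j == i is the empty suffix (the root)
        if j < i and occurrences(prefix, x[j:i]) == 1:
            labels.append(1)           # I: x[j:i] is a leaf, extended for free
        elif x[j:i + 1] in prefix:
            labels.append(3)           # III: extended string is already in the trie
        else:
            labels.append(2)           # II: a new leaf has to be created
    return labels


for i in range(4):
    print(i, classify("aba$", i))
```

For every iteration the labels come out in the order I, then II, then III, as the corollary states. With a unique terminator, the number of II labels summed over all iterations equals the number of leaves in the final tree.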
19 Updated algorithm Implicitly handling free operations: For i=1,...,n+1: for j = L,...,R: find x[j..i], append x[i+1]. L in iteration i is the last leaf inserted in iteration i-1 (all smaller indices are already leaves). R in iteration i is the first index where x[j..i+1] is already in the trie (all larger indices are already in the trie).
21 Suffixes in II are made into leaves Whenever L < j < R, j is made a leaf. Once j is a leaf, it will be in I and never in II again.
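22 Time to go from II to I Each suffix is handled explicitly in II or implicitly in III, in total time 2n. [Figure: staircase path through a grid with the index in II on one axis and the iterations on the other, both running up to n+1.] Path length is 2n.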
24 Updated running time For i=1,...,n+1: for j = L,...,R: find x[j..i], append x[i+1]. Only 2n of these inner steps are executed, and each append takes constant time. Is the find constant time as well? Running time is 2n*T(find x[j..i]). We just have to deal with T(find x[j..i]) in O(1). No worries!
25 Using fastscan and s(-) When searching for x[j..i], it is already in the trie, so we can use fastscan for the search. T(find x[j..i]) is in O(d), where d is the (node-)depth of x[j..i]. If we keep suffix links, s(-), in the tree we can use these as shortcuts.
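A minimal sketch of fastscan, assuming a compressed trie stored as nested dicts that map the first character of an edge to a (label, child) pair. The representation, the hand-built tree for aba$, and the returned hop count are assumptions made for the example.

```python
LEAF = {}  # leaves have no outgoing edges


def fastscan(node, s):
    """Locate s, which is known to be in the tree, looking at one character per edge.

    Returns (node, edge_label, offset, hops): s ends `offset` characters into the
    edge `edge_label` below `node`, or exactly at `node` when offset is 0.
    """
    k = 0
    hops = 0
    while k < len(s):
        label, child = node[s[k]]       # pick the edge by its first character only
        if len(s) - k < len(label):     # s ends inside this edge
            return node, label, len(s) - k, hops
        k += len(label)                 # otherwise skip the whole edge
        node = child
        hops += 1
    return node, "", 0, hops


# Hand-built suffix tree of aba$: inner node "a" plus four leaves.
inner_a = {"b": ("ba$", LEAF), "$": ("$", LEAF)}
root = {"a": ("a", inner_a), "b": ("ba$", LEAF), "$": ("$", LEAF)}

node, label, offset, hops = fastscan(root, "aba")
print(label, offset, hops)
```

The work is proportional to the number of edges passed (the node-depth d), not to the length of s.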
27 Suffix links Invariant: All inner nodes have suffix links. Ensuring the invariant: We only insert inner nodes when adding leaves. Whenever we insert a new node x[j..i], for some j<i, we also find or insert x[j+1..i], and can update s(x[j..i]) := x[j+1..i]. If we insert x[i..i], then s(x[i..i]) := the root (the empty string).
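The invariant can be illustrated by brute force on a finished tree. The sketch assumes the text ends in a unique terminator, so that the inner nodes are exactly the substrings followed by at least two different characters; the function names are hypothetical, and the check covers only the final tree, not the intermediate ones.

```python
def branching_substrings(text):
    """Substrings of text that are followed by at least two distinct characters."""
    followers = {}
    for i in range(len(text)):
        for j in range(i, len(text)):   # substring text[i:j] is followed by text[j]
            followers.setdefault(text[i:j], set()).add(text[j])
    return {s for s, f in followers.items() if len(f) >= 2}


def suffix_links(text):
    """Map every non-root inner node aw to w (its suffix link target)."""
    nodes = branching_substrings(text)
    links = {s: s[1:] for s in nodes if s}
    # Every target must itself be an inner node (or the root, the empty string).
    assert all(target in nodes for target in links.values())
    return links


print(suffix_links("abab$"))
```

For abab$ the inner nodes are ab and b, with s(ab) = b and s(b) = the root.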
28 Finding x[j+1..i] from parent(x[j..i]) Start at x[j..i] (the initial j is L, and we can keep a pointer to that node between iterations), move up to parent(x[j..i]), follow the suffix link s to s(parent(x[j..i])), and use fastscan from there down to x[j+1..i].
30 Bound on fastscan Time usage by fastscan is bounded by n for the maximal (node-)depth in the trie plus the total decrease of (node-)depth. Decrease in depth: Moving to parent(x[j..i]): 1. Moving to s(parent(x[j..i])): max 1. Restarting at L: ? [Figure: path from x[j..i] via parent(x[j..i]) and s(parent(x[j..i])) down to x[j+1..i], with fastscan used on the last part.]
32 Hacking the suffix link When searching for x[L+1..i], update s(x[L..i]) to point to the nearest ancestor of x[L+1..i]. This lets us restart in constant time. [Figure: parent(L), its suffix link s to s(parent(L)), and the nearest ancestor of x[L+1..i].]
33 Searching time Vertical steps are paid for by the previous horizontal step (free restarting). Horizontal steps are the total fastscan time, bounded by O(n). Runtime O(n). [Figure: staircase path with the index in II on one axis and the iterations on the other, both running up to n+1.]
34 Summary We have seen a new suffix tree construction algorithm. Ukkonen's algorithm is an online algorithm: as long as no suffix is a prefix of another, the intermediate trees are suffix trees. Generalized suffix trees can be built one string at a time.
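For reference, here is a compact sketch of an online linear-time construction. It follows the widely used "active point" formulation (active_node, active_edge, active_len, remainder) rather than the L/R bookkeeping and hacked suffix links of the slides; all names are assumptions of this sketch. Open leaf edges are stored with end = None, and the text is assumed to end in a unique terminator.

```python
class Node:
    def __init__(self, start, end):
        self.start = start      # incoming edge label is text[start:end]
        self.end = end          # None marks an open leaf edge (ends at the current position)
        self.children = {}
        self.link = None        # suffix link; None is treated as the root


def build_suffix_tree(text):
    root = Node(0, 0)
    active_node, active_edge, active_len = root, 0, 0
    remainder = 0                                   # suffixes still waiting to be inserted
    for pos, c in enumerate(text):
        remainder += 1
        last_new = None                             # inner node awaiting its suffix link
        while remainder > 0:
            if active_len == 0:
                active_edge = pos
            child = active_node.children.get(text[active_edge])
            if child is None:                       # no edge: add a new leaf
                active_node.children[text[active_edge]] = Node(pos, None)
                if last_new is not None:
                    last_new.link = active_node
                    last_new = None
            else:
                end = pos + 1 if child.end is None else child.end
                length = end - child.start
                if active_len >= length:            # skip a whole edge (fastscan step)
                    active_node = child
                    active_edge += length
                    active_len -= length
                    continue
                if text[child.start + active_len] == c:   # already in the tree: stop
                    active_len += 1
                    if last_new is not None:
                        last_new.link = active_node
                    break
                split = Node(child.start, child.start + active_len)  # split the edge
                active_node.children[text[active_edge]] = split
                split.children[c] = Node(pos, None)
                child.start += active_len
                split.children[text[child.start]] = child
                if last_new is not None:
                    last_new.link = split
                last_new = split
            remainder -= 1
            if active_node is root and active_len > 0:
                active_len -= 1
                active_edge = pos - remainder + 1
            elif active_node is not root:
                active_node = active_node.link or root
    return root


def leaf_strings(node, text, path=""):
    """Collect the root-to-leaf strings, i.e. the suffixes stored in the tree."""
    if not node.children:
        return [path]
    out = []
    for child in node.children.values():
        end = len(text) if child.end is None else child.end
        out.extend(leaf_strings(child, text, path + text[child.start:end]))
    return out


print(sorted(leaf_strings(build_suffix_tree("aba$"), "aba$")))
```

The check at the end reads the suffixes back from the leaves; with a unique terminator every suffix ends at its own leaf.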