Parameterized Suffix Arrays for Binary Strings

Size: px
Start display at page:

Download "Parameterized Suffix Arrays for Binary Strings"

Transcription

1 Parameterized Suffix Arrays for Binary Strings Satoshi Deguchi, Fumihito Higashijima, Hideo Bannai, Shunsuke Inenaga and Masayuki Takeda Kyushu University, Japan

2 Contents Background Parameterized matching [Baker 93] Linear time algorithms for constructing parameterized suffix arrays parameterized LCP arrays for inary strings Computational Experiments Summary and Open Prolems

3 Background sort program y student A void sort(int *a, int n) { int p, q, t; sort program y student B void sort(int *p, int n) { int x, y, t; } for (p = 0; p < n; p++) { for (q = n-1; q > p; q--) { if (a[q] < a[q-1]) { t = a[q]; a[q] = a[q-1]; a[q-1] = t; } } } } for (x = 0; x < n; x++) { for (y = n-1; y > x; y--) { if (p[y] < p[y-1]) { t = p[y]; p[y] = p[y-1]; p[y-1] = t; } } }

4 Parameterized pattern matching Parameterized string [Baker 93] A string over the parameter alphaet Π and constant alphaet Σ (Π Σ = φ) Parameterized match we will only consider the case Σ = φ in this talk. Two parameterized strings x,y parameterized match if there exists a ijection f on Π Σ where f(c) = c for c Σ, and f(x) = f(x 1 x n )=f(x 1 ) f(x n )=y if (a[q] < a[q-1]) { t = a[q]; a[q] = a[q-1]; a[q-1] = t; } q y a p if (p[y] < p[y-1]) { t = p[y]; p[y] = p[y-1]; p[y-1] = t; }

5 Parameterized pattern matching [Baker 93] Parameterized matching prolem Given parameterized strings p and t, find all positions i where p and t i t i+ p -1 parameterized match t : xxyzxyyxzyx p : xyzx x x y z x y y x z y x xyzx xzyx (y z,z y) yxzy (x y,y x) yzxy (x y,y z,z x) zxyz (x z,y x,z y) zyxz (x z,z x)

6 Parameterized pattern matching [Baker 93] pv(s)[i] = 0 if s[i] s[j] for any 1 j < i i k if k = max { j s[i] = s[j] 1 j < i } pv( aaaaaaaa) = Proposition [Baker `93] Two parameterized strings s,t parameterized match pv(s) = pv(t) t:xxyzxyyxzyx p:xyzx pv(yzxy) = 0003 pv(xyzx) = 0003

7 Parameterized suffix tree (p-suffix tree) [Baker 93] parameterized string t O( t ) time * construction 1. aa a a pv() of each suffix p-suffix tree O( p + #occ) time * matching * Assuming constant size alphaet

8 Standard suffix trees and arrays Linear time [Weiner 73] aa Linear time [Ko&Aluru 03, Karkkainen&Sanders 03, Kim et al. 03] suffix tree suffix array SA[] LCP array LCP[] Suffix arrays are reported to e superior in terms of speed/memory usage, for many applications a a a a a a a Linear time (simple traversal) Linear time [Kasai et al. 01]

9 P-suffix trees and arrays 0 Linear time [Baker '93] p-suffix tree 0 1 aa Linear time for inary strings [this work] P-suffix array PSA[] P-LCP array PLCP[] Linear time (simple traversal) Linear time for inary strings [this work]

10 This work We consider the p-suffix array and p-lcp array show linear time algorithms for direct construction of p-suffix array and its p-lcp array for inary strings

11 Difficulties of pv A pv of a suffix is not necessarily a suffix of a pv aaaaaaaa pv(aaaaaaaa) = suffix NOT suffix aaaaaaa pv( aaaaaaa) =

12 Difficulties of pv Lexicographic order etween suffixes of two strings is not necessarily preserved, even when they share a common prefix pv( aaaaaaa) = > pv(aaaaaaaa) = suffix suffix pv( aaaaaaa) = > pv( aaaaaaa) = Linear time algorithms for direct construction of standard suffix arrays won't work on the pv array.

13 Direct construction of p-suffix array for inary strings Oservation Monotonically decrease?

14 fw array fw(s)[i] = if s[i] s[j] for any i < j s i k if k = min { j s[i] = s[j] i < j s } aaaaaaaa fw(aaaaaaaa) = suffix suffix aaaaaaa fw( aaaaaaa) = A fw of a suffix is always a suffix of a fw!

15 Direct construction of p-suffix array Lemma for inary strings For any inary string s, the p-suffix array of s is equivalent to the standard suffix array of fw(s) Theorem The p-suffix array for inary string of length n can e constructed directly in O(n) time. (With the reverse order on integers)

16 Linear time LCP array construction Standard suffix arrays: [Kasai et al. 2001] i SA[i] s[sa[i]:n] LCP s [i] aa a aaaa aaa aa aaa aa 1 lcp[5] 2 1 lcp[8] 2 1 Depends on preservation of lexicographic order of suffixes with common prefix won't work for parameterized strings

17 Lexicographic order is not preserved even when PLCP > 0. 3 < 4 12 > 8 Standard LCP algorithm won't work. PLCP : i PSA[i] s[psa[i]:n] pv(s[psa[i]:n]) PLCP s [i] 1 12 a a aaaaa aa aaaaaaa aaaaaa aaaaaaaa a aaa aaaaaaa aaaa aaaaa

18 Linear time parameterized LCP array construction for Binary Strings The LCP array of fw can e used to compute the PLCP array of pv Lemma For any inary string s, PLCP s [i] = min{ LCP fw(s) [i] +fw(s)[psa[i]+l], s PSA[i 1]+1 } LCP of fw + first disagreeing fw element Length of PSA[i-1] th suffix

19 PLCP s [i] = min{lcp fw(s) [i] +fw(s)[psa[i]+l], s PSA[i 1]+1} i PSA[i] 2 5 s[psa[i]:n] PLCP s [i] fw(s[psa[i]:n]) LCP s [i] 1 12 a -1-1 a0 a0 0 a2 a1 a1 a1 5 0 a2 a1 3 LCP of fw + first disagreeing fw element 2 11 a aaaaa aa LCP of 5fw=1 first 2 disagreeing aaaaaaa fw element= aaaaaa aaaaaaaa a aaa aaaaaaa aaaa aaaaa

20 PLCP s [i] = min{lcp fw(s) [i] +fw(s)[psa[i]+l], s PSA[i 1] +1} Length of PSA[i-1]=10 i PSA[i] s[psa[i]:n] th suffix = 3 PLCPs[i] fw(s[psa[i]:n]) LCP s [i] a a a0 a0 a1 0 1 a3 Length of PSA[i-1] th suffix 3 5 aaaaa aa aaaaaaa LCP of 6fw=1 first 4 disagreeing aaaaaafw element= aaaaaaaa a = aaa aaaaaaa aaaa aaaaa

21 i PSA[i] s[psa[i]:n] PLCP s [i] fw(s[psa[i]:n]) LCP s [i] 1 12 a a 1 1 min{ 0+5, 2 } 3 5 aaaaa aa min{ 2+2, 4 } 5 2 aaaaaaa aaaaaa min{ 1 + 3, 9 } 7 1 aaaaaaaa a aaa Theorem 3 aaaaaaa The PLCP array for inary string of length n can e constructed in O(n) time, given its p-suffix array aaaa aaaaa

22 Computational Experiments We compared speed of parameterized pattern matching on random inary strings y: p-suffix array of t matched with pv(p) standard suffix array t, matched with p and p' 2 standard suffix arrays t and t', each matched with p itwise complement In general, for alphaet size k, we can construct k! standard suffix arrays for each permutation of parameterized alphaet construct k! different patterns for each permutation of parameterized alphaet and do standard matching k! times

23 Text length = 100 Running time p-suffix array pattern length

24 Text length = 1000 Running time p-suffix array pattern length

25 Summary Showed linear time algorithm to construct p-suffix arrays and p-lcp arrays directly from inary strings (does not construct p-suffix tree) Showed efficiency of p-suffix arrays through computational experiments Up to 200% faster than using suffix arrays twice

26 Open Prolems Direct construction of p-suffix array for alphaet size > 2. Reverse prolem: Infer the parameterized string whose p-suffix array is a given permutation.

Languages. A language is a set of strings. String: A sequence of letters. Examples: cat, dog, house, Defined over an alphabet:

Languages. A language is a set of strings. String: A sequence of letters. Examples: cat, dog, house, Defined over an alphabet: Languages 1 Languages A language is a set of strings String: A sequence of letters Examples: cat, dog, house, Defined over an alphaet: a,, c,, z 2 Alphaets and Strings We will use small alphaets: Strings

More information

Converting SLP to LZ78 in almost Linear Time

Converting SLP to LZ78 in almost Linear Time CPM 2013 Converting SLP to LZ78 in almost Linear Time Hideo Bannai 1, Paweł Gawrychowski 2, Shunsuke Inenaga 1, Masayuki Takeda 1 1. Kyushu University 2. Max-Planck-Institut für Informatik Recompress SLP

More information

CS 420, Spring 2018 Homework 4 Solutions. 1. Use the Pumping Lemma to show that the following languages are not regular: (a) {0 2n 10 n n 0};

CS 420, Spring 2018 Homework 4 Solutions. 1. Use the Pumping Lemma to show that the following languages are not regular: (a) {0 2n 10 n n 0}; CS 420, Spring 2018 Homework 4 Solutions 1. Use the Pumping Lemma to show that the following languages are not regular: (a) {0 2n 10 n n 0}; Solution: Given p 1, choose s = 0 2p 10 p. Then, s is in the

More information

in Trigonometry Name Section 6.1 Law of Sines Important Vocabulary

in Trigonometry Name Section 6.1 Law of Sines Important Vocabulary Name Chapter 6 Additional Topics in Trigonometry Section 6.1 Law of Sines Objective: In this lesson you learned how to use the Law of Sines to solve oblique triangles and how to find the areas of oblique

More information

A DARK GREY P O N T, with a Switch Tail, and a small Star on the Forehead. Any

A DARK GREY P O N T, with a Switch Tail, and a small Star on the Forehead. Any Y Y Y X X «/ YY Y Y ««Y x ) & \ & & } # Y \#$& / Y Y X» \\ / X X X x & Y Y X «q «z \x» = q Y # % \ & [ & Z \ & { + % ) / / «q zy» / & / / / & x x X / % % ) Y x X Y $ Z % Y Y x x } / % «] «] # z» & Y X»

More information

1 Alphabets and Languages

1 Alphabets and Languages 1 Alphabets and Languages Look at handout 1 (inference rules for sets) and use the rules on some examples like {a} {{a}} {a} {a, b}, {a} {{a}}, {a} {{a}}, {a} {a, b}, a {{a}}, a {a, b}, a {{a}}, a {a,

More information

1.3 Regular Expressions

1.3 Regular Expressions 51 1.3 Regular Expressions These have an important role in descriing patterns in searching for strings in many applications (e.g. awk, grep, Perl,...) All regular expressions of alphaet are 1.Øand are

More information

Inferring Strings from Graphs and Arrays

Inferring Strings from Graphs and Arrays Inferring Strings from Graphs and Arrays Hideo Bannai 1, Shunsuke Inenaga 2, Ayumi Shinohara 2,3, and Masayuki Takeda 2,3 1 Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1

More information

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I Internal Examination 2017-18 B.Tech III Year VI Semester Sub: Theory of Computation (6CS3A) Time: 1 Hour 30 min. Max Marks: 40 Note: Attempt all three

More information

Counting and Verifying Maximal Palindromes

Counting and Verifying Maximal Palindromes Counting and Verifying Maximal Palindromes Tomohiro I 1, Shunsuke Inenaga 2, Hideo Bannai 1, and Masayuki Takeda 1 1 Department of Informatics, Kyushu University 2 Graduate School of Information Science

More information

SUFFIX TREE. SYNONYMS Compact suffix trie

SUFFIX TREE. SYNONYMS Compact suffix trie SUFFIX TREE Maxime Crochemore King s College London and Université Paris-Est, http://www.dcs.kcl.ac.uk/staff/mac/ Thierry Lecroq Université de Rouen, http://monge.univ-mlv.fr/~lecroq SYNONYMS Compact suffix

More information

Faster Online Elastic Degenerate String Matching

Faster Online Elastic Degenerate String Matching Faster Online Elastic Degenerate String Matching Kotaro Aoyama Department of Electrical Engineering and Computer Science, Kyushu University, Japan kotaro.aoyama@inf.kyushu-u.ac.jp Yuto Nakashima Department

More information

Sheet 1-8 Dr. Mostafa Aref Format By : Mostafa Sayed

Sheet 1-8 Dr. Mostafa Aref Format By : Mostafa Sayed Sheet -8 Dr. Mostafa Aref Format By : Mostafa Sayed 09 Introduction Assignment. For = {a, } a) Write 0 strings of the following languages i) All strings with no more than one a,,, a, a, a, a, a, a, a ii)

More information

Space-Efficient Construction Algorithm for Circular Suffix Tree

Space-Efficient Construction Algorithm for Circular Suffix Tree Space-Efficient Construction Algorithm for Circular Suffix Tree Wing-Kai Hon, Tsung-Han Ku, Rahul Shah and Sharma Thankachan CPM2013 1 Outline Preliminaries and Motivation Circular Suffix Tree Our Indexes

More information

Discovering Most Classificatory Patterns for Very Expressive Pattern Classes

Discovering Most Classificatory Patterns for Very Expressive Pattern Classes Discovering Most Classificatory Patterns for Very Expressive Pattern Classes Masayuki Takeda 1,2, Shunsuke Inenaga 1,2, Hideo Bannai 3, Ayumi Shinohara 1,2, and Setsuo Arikawa 1 1 Department of Informatics,

More information

ACCEPTS HUGE FLORAL KEY TO LOWELL. Mrs, Walter Laid to Rest Yesterday

ACCEPTS HUGE FLORAL KEY TO LOWELL. Mrs, Walter Laid to Rest Yesterday $ j < < < > XXX Y 928 23 Y Y 4% Y 6 -- Q 5 9 2 5 Z 48 25 )»-- [ Y Y Y & 4 j q - Y & Y 7 - -- - j \ -2 -- j j -2 - - - - [ - - / - ) ) - - / j Y 72 - ) 85 88 - / X - j ) \ 7 9 Y Y 2 3» - ««> Y 2 5 35 Y

More information

Warm Up answers. 1. x 2. 3x²y 8xy² 3. 7x² - 3x 4. 1 term 5. 2 terms 6. 3 terms

Warm Up answers. 1. x 2. 3x²y 8xy² 3. 7x² - 3x 4. 1 term 5. 2 terms 6. 3 terms Warm Up answers 1. x 2. 3x²y 8xy² 3. 7x² - 3x 4. 1 term 5. 2 terms 6. 3 terms Warm Up Assignment 10/23/14 Section 6.1 Page 315: 2 12 (E) 40 58 (E) 66 Section 6.2 Page 323: 2 12 (E) 16 36 (E) 42 46 (E)

More information

Computing Longest Common Substrings Using Suffix Arrays

Computing Longest Common Substrings Using Suffix Arrays Computing Longest Common Substrings Using Suffix Arrays Maxim A. Babenko, Tatiana A. Starikovskaya Moscow State University Computer Science in Russia, 2008 Maxim A. Babenko, Tatiana A. Starikovskaya (MSU)Computing

More information

A Universal Turing Machine

A Universal Turing Machine A Universal Turing Machine A limitation of Turing Machines: Turing Machines are hardwired they execute only one program Real Computers are re-programmable Solution: Universal Turing Machine Attributes:

More information

Two Posts to Fill On School Board

Two Posts to Fill On School Board Y Y 9 86 4 4 qz 86 x : ( ) z 7 854 Y x 4 z z x x 4 87 88 Y 5 x q x 8 Y 8 x x : 6 ; : 5 x ; 4 ( z ; ( ) ) x ; z 94 ; x 3 3 3 5 94 ; ; ; ; 3 x : 5 89 q ; ; x ; x ; ; x : ; ; ; ; ; ; 87 47% : () : / : 83

More information

CS 341 Homework 16 Languages that Are and Are Not Context-Free

CS 341 Homework 16 Languages that Are and Are Not Context-Free CS 341 Homework 16 Languages that Are and Are Not Context-Free 1. Show that the following languages are context-free. You can do this by writing a context free grammar or a PDA, or you can use the closure

More information

Section 2.1 The Derivative and the Tangent Line Problem

Section 2.1 The Derivative and the Tangent Line Problem Chapter 2 Differentiation Course Number Section 2.1 The Derivative an the Tangent Line Problem Objective: In this lesson you learne how to fin the erivative of a function using the limit efinition an unerstan

More information

A Fully Compressed Pattern Matching Algorithm for Simple Collage Systems

A Fully Compressed Pattern Matching Algorithm for Simple Collage Systems A Fully Compressed Pattern Matching Algorithm for Simple Collage Systems Shunsuke Inenaga 1, Ayumi Shinohara 2,3 and Masayuki Takeda 2,3 1 Department of Computer Science, P.O. Box 26 (Teollisuuskatu 23)

More information

LOWELL WEEKLY JOURNAL.

LOWELL WEEKLY JOURNAL. Y 5 ; ) : Y 3 7 22 2 F $ 7 2 F Q 3 q q 6 2 3 6 2 5 25 2 2 3 $2 25: 75 5 $6 Y q 7 Y Y # \ x Y : { Y Y Y : ( \ _ Y ( ( Y F [ F F ; x Y : ( : G ( ; ( ~ x F G Y ; \ Q ) ( F \ Q / F F \ Y () ( \ G Y ( ) \F

More information

and A L T O S O L O LOWELL, MICHIGAN, THURSDAY, OCTCBER Mrs. Thomas' Young Men Good Bye 66 Long Illness Have Sport in

and A L T O S O L O LOWELL, MICHIGAN, THURSDAY, OCTCBER Mrs. Thomas' Young Men Good Bye 66 Long Illness Have Sport in 5 7 8 x z!! Y! [! 2 &>3 x «882 z 89 q!!! 2 Y 66 Y $ Y 99 6 x x 93 x 7 8 9 x 5$ 4 Y q Q 22 5 3 Z 2 5 > 2 52 2 $ 8» Z >!? «z???? q > + 66 + + ) ( x 4 ~ Y Y»» x ( «/ ] x ! «z x( ) x Y 8! < 6 x x 8 \ 4\

More information

Longest Lyndon Substring After Edit

Longest Lyndon Substring After Edit Longest Lyndon Substring After Edit Yuki Urabe Department of Electrical Engineering and Computer Science, Kyushu University, Japan yuki.urabe@inf.kyushu-u.ac.jp Yuto Nakashima Department of Informatics,

More information

Monday, April 29, 13. Tandem repeats

Monday, April 29, 13. Tandem repeats Tandem repeats Tandem repeats A string w is a tandem repeat (or square) if: A tandem repeat αα is primitive if and only if α is primitive A string w = α k for k = 3,4,... is (by some) called a tandem array

More information

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages Do Homework 2. What Is a Language? Grammars, Languages, and Machines L Language Grammar Accepts Machine Strings: the Building Blocks of Languages An alphabet is a finite set of symbols: English alphabet:

More information

Breaking a Time-and-Space Barrier in Constructing Full-Text Indices

Breaking a Time-and-Space Barrier in Constructing Full-Text Indices Breaking a Time-and-Space Barrier in Constructing Full-Text Indices Wing-Kai Hon Kunihiko Sadakane Wing-Kin Sung Abstract Suffix trees and suffix arrays are the most prominent full-text indices, and their

More information

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction.

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction. Microsoft Research Asia September 5, 2005 1 2 3 4 Section I What is? Definition is a technique for efficiently recurrence computing by storing partial results. In this slides, I will NOT use too many formal

More information

Sorting suffixes of two-pattern strings

Sorting suffixes of two-pattern strings Sorting suffixes of two-pattern strings Frantisek Franek W. F. Smyth Algorithms Research Group Department of Computing & Software McMaster University Hamilton, Ontario Canada L8S 4L7 April 19, 2004 Abstract

More information

Faster Compact On-Line Lempel-Ziv Factorization

Faster Compact On-Line Lempel-Ziv Factorization Faster Compact On-Line Lempel-Ziv Factorization Jun ichi Yamamoto, Tomohiro I, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda Department of Informatics, Kyushu University, Nishiku, Fukuoka, Japan

More information

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th CSE 3500 Algorithms and Complexity Fall 2016 Lecture 10: September 29, 2016 Quick sort: Average Run Time In the last lecture we started analyzing the expected run time of quick sort. Let X = k 1, k 2,...,

More information

Program Analysis. Lecture 5. Rayna Dimitrova WS 2016/2017

Program Analysis. Lecture 5. Rayna Dimitrova WS 2016/2017 Program Analysis Lecture 5 Rayna Dimitrova WS 2016/2017 2/21 Recap: Constant propagation analysis Goal: For each program point, determine whether a variale has a constant value whenever an execution reaches

More information

Practice Test III, Math 314, Spring 2016

Practice Test III, Math 314, Spring 2016 Practice Test III, Math 314, Spring 2016 Dr. Holmes April 26, 2016 This is the 2014 test reorganized to be more readable. I like it as a review test. The students who took this test had to do four sections

More information

OWELL WEEKLY JOURNAL

OWELL WEEKLY JOURNAL Y \»< - } Y Y Y & #»»» q ] q»»»>) & - - - } ) x ( - { Y» & ( x - (» & )< - Y X - & Q Q» 3 - x Q Y 6 \Y > Y Y X 3 3-9 33 x - - / - -»- --

More information

Università degli studi di Udine

Università degli studi di Udine Università degli studi di Udine Computing LZ77 in Run-Compressed Space This is a pre print version of the following article: Original Computing LZ77 in Run-Compressed Space / Policriti, Alberto; Prezza,

More information

Section 4.1 Switching Algebra Symmetric Functions

Section 4.1 Switching Algebra Symmetric Functions Section 4.1 Switching Algebra Symmetric Functions Alfredo Benso Politecnico di Torino, Italy Alfredo.benso@polito.it Symmetric Functions A function in which each input variable plays the same role in determining

More information

CSE Theory of Computing: Homework 3 Regexes and DFA/NFAs

CSE Theory of Computing: Homework 3 Regexes and DFA/NFAs CSE 34151 Theory of Computing: Homework 3 Regexes and DFA/NFAs Version 1: Fe. 6, 2018 Instructions Unless otherwise specified, all prolems from the ook are from Version 3. When a prolem in the International

More information

Building a befer mousetrap. Transi-on Graphs. L = { abba } Building a befer mousetrap. L = { abba } L = { abba } 2/16/15

Building a befer mousetrap. Transi-on Graphs. L = { abba } Building a befer mousetrap. L = { abba } L = { abba } 2/16/15 Building a efer mousetrap Let s uild an FA that only accepts the string aa from Σ = { a } Models of Computa-on Lecture #5 (a?er quiz) Chapter 6 Building a efer mousetrap Let s uild an FA that only accepts

More information

ECS120 Fall Discussion Notes. October 25, The midterm is on Thursday, November 2nd during class. (That is next week!)

ECS120 Fall Discussion Notes. October 25, The midterm is on Thursday, November 2nd during class. (That is next week!) ECS120 Fall 2006 Discussion Notes October 25, 2006 Announcements The midterm is on Thursday, November 2nd during class. (That is next week!) Homework 4 Quick Hints Problem 1 Prove that the following languages

More information

6.1 The Pumping Lemma for CFLs 6.2 Intersections and Complements of CFLs

6.1 The Pumping Lemma for CFLs 6.2 Intersections and Complements of CFLs CSC4510/6510 AUTOMATA 6.1 The Pumping Lemma for CFLs 6.2 Intersections and Complements of CFLs The Pumping Lemma for Context Free Languages One way to prove AnBn is not regular is to use the pumping lemma

More information

Chapter 3. Regular grammars

Chapter 3. Regular grammars Chapter 3 Regular grammars 59 3.1 Introduction Other view of the concept of language: not the formalization of the notion of effective procedure, but set of words satisfying a given set of rules Origin

More information

Advanced Text Indexing Techniques. Johannes Fischer

Advanced Text Indexing Techniques. Johannes Fischer Advanced ext Indexing echniques Johannes Fischer SS 2009 1 Suffix rees, -Arrays and -rays 1.1 Recommended Reading Dan Gusfield: Algorithms on Strings, rees, and Sequences. 1997. ambridge University Press,

More information

Theoretical aspects of ERa, the fastest practical suffix tree construction algorithm

Theoretical aspects of ERa, the fastest practical suffix tree construction algorithm Theoretical aspects of ERa, the fastest practical suffix tree construction algorithm Matevž Jekovec University of Ljubljana Faculty of Computer and Information Science Oct 10, 2013 Text indexing problem

More information

Non-regular languages

Non-regular languages Non-regular languages (Pumping Lemma) Fall 2017 Costas Busch - RPI 1 Non-regular languages { a n n : n 0} { vv R : v{ a, }*} Regular languages a* * c a c( a )* etc... Fall 2017 Costas Busch - RPI 2 How

More information

SORTING SUFFIXES OF TWO-PATTERN STRINGS.

SORTING SUFFIXES OF TWO-PATTERN STRINGS. International Journal of Foundations of Computer Science c World Scientific Publishing Company SORTING SUFFIXES OF TWO-PATTERN STRINGS. FRANTISEK FRANEK and WILLIAM F. SMYTH Algorithms Research Group,

More information

Longest Common Prefixes

Longest Common Prefixes Longest Common Prefixes The standard ordering for strings is the lexicographical order. It is induced by an order over the alphabet. We will use the same symbols (,

More information

Gray Codes and Overlap Cycles for Restricted Weight Words

Gray Codes and Overlap Cycles for Restricted Weight Words Gray Codes and Overlap Cycles for Restricted Weight Words Victoria Horan Air Force Research Laboratory Information Directorate Glenn Hurlbert School of Mathematical and Statistical Sciences Arizona State

More information

Average Case Analysis of QuickSort and Insertion Tree Height using Incompressibility

Average Case Analysis of QuickSort and Insertion Tree Height using Incompressibility Average Case Analysis of QuickSort and Insertion Tree Height using Incompressibility Tao Jiang, Ming Li, Brendan Lucier September 26, 2005 Abstract In this paper we study the Kolmogorov Complexity of a

More information

IOAN ŞERDEAN, DANIEL SITARU

IOAN ŞERDEAN, DANIEL SITARU Romanian Mathematical Magazine Web: http://www.ssmrmh.ro The Author: This article is published with open access. TRIGONOMETRIC SUBSTITUTIONS IN PROBLEM SOLVING PART IOAN ŞERDEAN, DANIEL SITARU Abstract.

More information

CVO103: Programming Languages. Lecture 2 Inductive Definitions (2)

CVO103: Programming Languages. Lecture 2 Inductive Definitions (2) CVO103: Programming Languages Lecture 2 Inductive Definitions (2) Hakjoo Oh 2018 Spring Hakjoo Oh CVO103 2018 Spring, Lecture 2 March 13, 2018 1 / 20 Contents More examples of inductive definitions natural

More information

A Quick Introduction to Vectors, Vector Calculus, and Coordinate Systems

A Quick Introduction to Vectors, Vector Calculus, and Coordinate Systems Copyright 2006, David A. Randall Revised Wed, Feb 06, :36:23 A Quick Introduction to Vectors, Vector Calculus, and Coordinate Systems David A. Randall Department of Atmospheric Science Colorado State University,

More information

DEPARTMENT OF MATHEMATIC EDUCATION MATHEMATIC AND NATURAL SCIENCE FACULTY

DEPARTMENT OF MATHEMATIC EDUCATION MATHEMATIC AND NATURAL SCIENCE FACULTY HANDOUT ABSTRACT ALGEBRA MUSTHOFA DEPARTMENT OF MATHEMATIC EDUCATION MATHEMATIC AND NATURAL SCIENCE FACULTY 2012 BINARY OPERATION We are all familiar with addition and multiplication of two numbers. Both

More information

LOWELL WEEKLY JOURNAL

LOWELL WEEKLY JOURNAL Y -» $ 5 Y 7 Y Y -Y- Q x Q» 75»»/ q } # ]»\ - - $ { Q» / X x»»- 3 q $ 9 ) Y q - 5 5 3 3 3 7 Q q - - Q _»»/Q Y - 9 - - - )- [ X 7» -» - )»? / /? Q Y»» # X Q» - -?» Q ) Q \ Q - - - 3? 7» -? #»»» 7 - / Q

More information

r/lt.i Ml s." ifcr ' W ATI II. The fnncrnl.icniccs of Mr*. John We mil uppn our tcpiiblicnn rcprc Died.

r/lt.i Ml s. ifcr ' W ATI II. The fnncrnl.icniccs of Mr*. John We mil uppn our tcpiiblicnn rcprc Died. $ / / - (\ \ - ) # -/ ( - ( [ & - - - - \ - - ( - - - - & - ( ( / - ( \) Q & - - { Q ( - & - ( & q \ ( - ) Q - - # & - - - & - - - $ - 6 - & # - - - & -- - - - & 9 & q - / \ / - - - -)- - ( - - 9 - - -

More information

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission. CSE 5 Homework Due: Monday October 9, 7 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in this

More information

Enumeration Schemes for Words Avoiding Permutations

Enumeration Schemes for Words Avoiding Permutations Enumeration Schemes for Words Avoiding Permutations Lara Pudwell November 27, 2007 Abstract The enumeration of permutation classes has been accomplished with a variety of techniques. One wide-reaching

More information

1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices.

1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices. 1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices. Proof: Every vertex, except the root, is the child of an internal vertex. Since there are i internal vertices, each of

More information

{a, b, c} {a, b} {a, c} {b, c} {a}

{a, b, c} {a, b} {a, c} {b, c} {a} Section 4.3 Order Relations A binary relation is an partial order if it transitive and antisymmetric. If R is a partial order over the set S, we also say, S is a partially ordered set or S is a poset.

More information

Chapter-2 BOOLEAN ALGEBRA

Chapter-2 BOOLEAN ALGEBRA Chapter-2 BOOLEAN ALGEBRA Introduction: An algebra that deals with binary number system is called Boolean Algebra. It is very power in designing logic circuits used by the processor of computer system.

More information

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013 Q.2 a. Prove by mathematical induction n 4 4n 2 is divisible by 3 for n 0. Basic step: For n = 0, n 3 n = 0 which is divisible by 3. Induction hypothesis: Let p(n) = n 3 n is divisible by 3. Induction

More information

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented

More information

String Indexing for Patterns with Wildcards

String Indexing for Patterns with Wildcards MASTER S THESIS String Indexing for Patterns with Wildcards Hjalte Wedel Vildhøj and Søren Vind Technical University of Denmark August 8, 2011 Abstract We consider the problem of indexing a string t of

More information

MANY BILLS OF CONCERN TO PUBLIC

MANY BILLS OF CONCERN TO PUBLIC - 6 8 9-6 8 9 6 9 XXX 4 > -? - 8 9 x 4 z ) - -! x - x - - X - - - - - x 00 - - - - - x z - - - x x - x - - - - - ) x - - - - - - 0 > - 000-90 - - 4 0 x 00 - -? z 8 & x - - 8? > 9 - - - - 64 49 9 x - -

More information

CSI Mathematical Induction. Many statements assert that a property of the form P(n) is true for all integers n.

CSI Mathematical Induction. Many statements assert that a property of the form P(n) is true for all integers n. CSI 2101- Mathematical Induction Many statements assert that a property of the form P(n) is true for all integers n. Examples: For every positive integer n: n! n n Every set with n elements, has 2 n Subsets.

More information

A conjecture on the alphabet size needed to produce all correlation classes of pairs of words

A conjecture on the alphabet size needed to produce all correlation classes of pairs of words A conjecture on the alphabet size needed to produce all correlation classes of pairs of words Paul Leopardi Thanks: Jörg Arndt, Michael Barnsley, Richard Brent, Sylvain Forêt, Judy-anne Osborn. Mathematical

More information

Define M to be a binary n by m matrix such that:

Define M to be a binary n by m matrix such that: The Shift-And Method Define M to be a binary n by m matrix such that: M(i,j) = iff the first i characters of P exactly match the i characters of T ending at character j. M(i,j) = iff P[.. i] T[j-i+.. j]

More information

Multiple Context-Free Grammars

Multiple Context-Free Grammars Ogden s Lemma, Multiple Context-Free Grammars, and the Control Language Hierarchy Makoto Kanazawa National Institute of Informatics and SOKENDAI Japan Multiple Context-Free Grammars Introduced by Seki,

More information

Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line

Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VF Files On-line MatBio 18 Solon P. Pissis and Ahmad Retha King s ollege London 02-Aug-2018 Solon P. Pissis and Ahmad Retha

More information

Lecture 6 January 15, 2014

Lecture 6 January 15, 2014 Advanced Graph Algorithms Jan-Apr 2014 Lecture 6 January 15, 2014 Lecturer: Saket Sourah Scrie: Prafullkumar P Tale 1 Overview In the last lecture we defined simple tree decomposition and stated that for

More information

Alternative Algorithms for Lyndon Factorization

Alternative Algorithms for Lyndon Factorization Alternative Algorithms for Lyndon Factorization Suhpal Singh Ghuman 1, Emanuele Giaquinta 2, and Jorma Tarhio 1 1 Department of Computer Science and Engineering, Aalto University P.O.B. 15400, FI-00076

More information

Three Profound Theorems about Mathematical Logic

Three Profound Theorems about Mathematical Logic Power & Limits of Logic Three Profound Theorems about Mathematical Logic Gödel's Completeness Theorem Thm 1, good news: only need to know* a few axioms & rules, to prove all validities. *Theoretically

More information

Automata Theory and Formal Grammars: Lecture 1

Automata Theory and Formal Grammars: Lecture 1 Automata Theory and Formal Grammars: Lecture 1 Sets, Languages, Logic Automata Theory and Formal Grammars: Lecture 1 p.1/72 Sets, Languages, Logic Today Course Overview Administrivia Sets Theory (Review?)

More information

Chapter 2 Boolean Algebra and Logic Gates

Chapter 2 Boolean Algebra and Logic Gates Chapter 2 Boolean Algebra and Logic Gates The most common postulates used to formulate various algebraic structures are: 1. Closure. N={1,2,3,4 }, for any a,b N we obtain a unique c N by the operation

More information

Formal Languages. We ll use the English language as a running example.

Formal Languages. We ll use the English language as a running example. Formal Languages We ll use the English language as a running example. Definitions. A string is a finite set of symbols, where each symbol belongs to an alphabet denoted by Σ. Examples. The set of all strings

More information

CSci 311, Models of Computation Chapter 4 Properties of Regular Languages

CSci 311, Models of Computation Chapter 4 Properties of Regular Languages CSci 311, Models of Computation Chapter 4 Properties of Regular Languages H. Conrad Cunningham 29 December 2015 Contents Introduction................................. 1 4.1 Closure Properties of Regular

More information

CS375: Logic and Theory of Computing

CS375: Logic and Theory of Computing CS375: Logic and Theory of Computing Fuhua (Frank) Cheng Department of Computer Science University of Kentucky 1 Tale of Contents: Week 1: Preliminaries (set alger relations, functions) (read Chapters

More information

Advanced Automata Theory 7 Automatic Functions

Advanced Automata Theory 7 Automatic Functions Advanced Automata Theory 7 Automatic Functions Frank Stephan Department of Computer Science Department of Mathematics National University of Singapore fstephan@comp.nus.edu.sg Advanced Automata Theory

More information

Homework 1/Solutions. Graded Exercises

Homework 1/Solutions. Graded Exercises MTH 310-3 Abstract Algebra I and Number Theory S18 Homework 1/Solutions Graded Exercises Exercise 1. Below are parts of the addition table and parts of the multiplication table of a ring. Complete both

More information

Fast String Kernels. Alexander J. Smola Machine Learning Group, RSISE The Australian National University Canberra, ACT 0200

Fast String Kernels. Alexander J. Smola Machine Learning Group, RSISE The Australian National University Canberra, ACT 0200 Fast String Kernels Alexander J. Smola Machine Learning Group, RSISE The Australian National University Canberra, ACT 0200 Alex.Smola@anu.edu.au joint work with S.V.N. Vishwanathan Slides (soon) available

More information

Online Computation of Abelian Runs

Online Computation of Abelian Runs Online Computation of Abelian Runs Gabriele Fici 1, Thierry Lecroq 2, Arnaud Lefebvre 2, and Élise Prieur-Gaston2 1 Dipartimento di Matematica e Informatica, Università di Palermo, Italy Gabriele.Fici@unipa.it

More information

Outline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10

Outline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10 Merge Sort 1 / 10 Outline 1 Merging 2 Merge Sort 3 Complexity of Sorting 4 Merge Sort and Other Sorts 2 / 10 Merging Merge sort is based on a simple operation known as merging: combining two ordered arrays

More information

COMP Logic for Computer Scientists. Lecture 16

COMP Logic for Computer Scientists. Lecture 16 COMP 1002 Logic for Computer Scientists Lecture 16 5 2 J dmin stuff 2 due Feb 17 th. Midterm March 2 nd. Semester break next week! Puzzle: the barber In a certain village, there is a (male) barber who

More information

Automata Theory for Presburger Arithmetic Logic

Automata Theory for Presburger Arithmetic Logic Automata Theory for Presburger Arithmetic Logic References from Introduction to Automata Theory, Languages & Computation and Constraints in Computational Logic Theory & Application Presented by Masood

More information

CS Lecture 8 & 9. Lagrange Multipliers & Varitional Bounds

CS Lecture 8 & 9. Lagrange Multipliers & Varitional Bounds CS 6347 Lecture 8 & 9 Lagrange Multipliers & Varitional Bounds General Optimization subject to: min ff 0() R nn ff ii 0, h ii = 0, ii = 1,, mm ii = 1,, pp 2 General Optimization subject to: min ff 0()

More information

Information Theory and Statistics Lecture 2: Source coding

Information Theory and Statistics Lecture 2: Source coding Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection

More information

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,

More information

Burrows-Wheeler Transforms in Linear Time and Linear Bits

Burrows-Wheeler Transforms in Linear Time and Linear Bits Burrows-Wheeler Transforms in Linear Time and Linear Bits Russ Cox (following Hon, Sadakane, Sung, Farach, and others) 18.417 Final Project BWT in Linear Time and Linear Bits Three main parts to the result.

More information

UNIVERSITY OF MICHIGAN UNDERGRADUATE MATH COMPETITION 28 APRIL 7, 2011

UNIVERSITY OF MICHIGAN UNDERGRADUATE MATH COMPETITION 28 APRIL 7, 2011 UNIVERSITY OF MICHIGAN UNDERGRADUATE MATH COMPETITION 28 APRIL 7, 20 Instructions. Write on the front of your blue book your student ID number. Do not write your name anywhere on your blue book. Each question

More information

Preview: Text Indexing

Preview: Text Indexing Simon Gog gog@ira.uka.de - Simon Gog: KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu Text Indexing Motivation Problems Given a text

More information

Succincter text indexing with wildcards

Succincter text indexing with wildcards University of British Columbia CPM 2011 June 27, 2011 Problem overview Problem overview Problem overview Problem overview Problem overview Problem overview Problem overview Problem overview Problem overview

More information

Combinatorial algorithms

Combinatorial algorithms Combinatorial algorithms computing subset rank and unrank, Gray codes, k-element subset rank and unrank, computing permutation rank and unrank Jiří Vyskočil, Radek Mařík 2012 Combinatorial Generation definition:

More information

T k b p M r will so ordered by Ike one who quits squuv. fe2m per year, or year, jo ad vaoce. Pleaie and THE ALTO SOLO

T k b p M r will so ordered by Ike one who quits squuv. fe2m per year, or year, jo ad vaoce. Pleaie and THE ALTO SOLO q q P XXX F Y > F P Y ~ Y P Y P F q > ##- F F - 5 F F?? 5 7? F P P?? - - F - F F - P 7 - F P - F F % P - % % > P F 9 P 86 F F F F F > X7 F?? F P Y? F F F P F F

More information

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak 4. Quantization and Data Compression ECE 32 Spring 22 Purdue University, School of ECE Prof. What is data compression? Reducing the file size without compromising the quality of the data stored in the

More information

arxiv: v2 [cs.ds] 22 Mar 2018

arxiv: v2 [cs.ds] 22 Mar 2018 Computing longest single-arm-gapped palindromes in a string arxiv:1609.03000v2 [cs.ds] 22 Mar 2018 Shintaro Narisada 1, Diptarama 1, Kazuyuki Narisawa 1, Shunsuke Inenaga 2, and Ayumi Shinohara 1 1 Graduate

More information