Analysis of Algorithms Prof. Karen Daniels


1 UMass Lowell Computer Science, Analysis of Algorithms, Prof. Karen Daniels, Spring 2012. Tuesday, 4/24/2012. String Matching Algorithms, Chapter 32*. (* Pseudocode uses 2nd edition conventions.)

2 Chapter Dependencies: Automata; Ch 32 String Matching. You're responsible for material in Sections of this chapter.

3 String Matching Algorithms: Motivation & Basics

4 String Matching Problem. Motivations: text editing, pattern matching in DNA sequences. (32.1) Text: array $T[1..n]$, with $n \ge m$. Pattern: array $P[1..m]$. Array element: character from finite alphabet Σ. Pattern $P$ occurs with shift $s$ in $T$ if $P[1..m] = T[s+1..s+m]$, where $0 \le s \le n-m$.

5 String Matching Algorithms: Worst-Case Execution Time (Text: array T[1..n]; Pattern: array P[1..m])
Naive Algorithm: Preprocessing 0; Matching O((n-m+1)m); Overall O((n-m+1)m).
Rabin-Karp: Preprocessing Θ(m); Matching O((n-m+1)m); Overall O((n-m+1)m) (better than this on average and in practice).
Finite Automaton: Preprocessing O(m|Σ|); Matching Θ(n); Overall O(n + m|Σ|).
Knuth-Morris-Pratt: Preprocessing Θ(m); Matching Θ(n); Overall Θ(n + m).

6 Notation & Terminology. Σ* = set of all finite-length strings formed using characters from alphabet Σ. Empty string: ε. $|x|$ = length of string $x$. $w$ is a prefix of $x$: $w \sqsubset x$ (e.g. $ab \sqsubset abcca$). $w$ is a suffix of $x$: $w \sqsupset x$ (e.g. $cca \sqsupset abcca$). The prefix and suffix relations are transitive.

7 Overlapping Suffix Lemma

8 String Matching Algorithms: Naive Algorithm

9 Naive String Matching. Annotation: implicit loop. Worst-case running time is in Θ((n-m+1)m).
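
The naive matcher can be sketched as follows. This is a minimal illustration, not the slide's pseudocode: the function name is hypothetical, and shifts are reported 0-based, so shift s here compares `text[s:s+m]`, which corresponds to T[s+1..s+m] in the slides' 1-based notation.

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Return every shift s such that pattern occurs in text at shift s."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):  # try all n-m+1 candidate shifts
        # The slice comparison hides the "implicit loop":
        # up to m character comparisons per shift.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts
```

Each of the n-m+1 shifts can cost up to m comparisons, which is where the Θ((n-m+1)m) worst case comes from (for example, a text of all `a` characters and a pattern of all `a` characters).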

10 String Matching Algorithms: Rabin-Karp

11 Rabin-Karp Algorithm. Assume each character is a digit in radix-$d$ notation (e.g. $d = 10$); convert to a numeric representation for mod operations. $p$ = decimal value of pattern. $t_s$ = decimal value of substring $T[s+1..s+m]$ for $s = 0, 1, \ldots, n-m$. Strategy: compute $p$ in O(m) time (which is in O(n)); compute all $t_s$ values in a total of O(n) time; find all valid shifts $s$ in O(n) time by comparing $p$ with each $t_s$. Compute $p$ in O(m) time using Horner's rule: $p = P[m] + d(P[m-1] + d(P[m-2] + \cdots + d(P[2] + d\,P[1]) \cdots))$. Compute $t_0$ similarly from $T[1..m]$ in O(m) time. Compute the remaining $t_s$ values in O(n-m) time with a rolling window: $t_{s+1} = d(t_s - d^{m-1}\,T[s+1]) + T[s+m+1]$.
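
Horner's rule and the rolling-window update can be tried out directly, before any modulus is introduced. The sketch below assumes the text consists of decimal digit characters (radix d = 10); the function names are hypothetical and indices are 0-based, so `text[s]` plays the role of T[s+1].

```python
def horner_value(digits: str, d: int = 10) -> int:
    """Value of a digit string in radix d, computed with Horner's rule."""
    value = 0
    for ch in digits:
        value = d * value + int(ch)
    return value


def window_values(text: str, m: int, d: int = 10) -> list[int]:
    """Return t_0, t_1, ..., t_{n-m}: the values of all length-m windows."""
    n = len(text)
    if m == 0 or m > n:
        return []
    high = d ** (m - 1)              # weight of the high-order digit
    t = horner_value(text[:m], d)    # t_0 from the first m digits
    values = [t]
    for s in range(n - m):
        # Drop the high-order digit text[s], shift left by one digit,
        # and bring in the new low-order digit text[s + m].
        t = d * (t - high * int(text[s])) + int(text[s + m])
        values.append(t)
    return values
```

For the text `31415` with m = 2, the windows are 31, 14, 41, 15, each obtained from the previous one with a constant number of arithmetic operations.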

12 Rabin-Karp Algorithm. $p$ and $t_s$ may be large, so reduce them mod a prime $q$. (Figure label: pattern match.)

13 Rabin-Karp Algorithm (continued). $t_{s+1} = d(t_s - d^{m-1}\,T[s+1]) + T[s+m+1]$, with $d^{m-1}$ taken mod $q$ (32.2). (Figure labels: pattern value p; spurious hit.)

14 Rabin-Karp Algorithm (continued)

15 Rabin-Karp Algorithm (continued). Pseudocode annotations: d is radix, q is modulus; high-order digit position for m-digit window; try all possible shifts; stopping condition; rule out spurious hit. Cost annotations: Θ(m) on three steps; Θ((n-m+1)m) for the loop that tries all possible shifts. Matching loop invariant: when line 10 is executed, $t_s = T[s+1..s+m] \bmod q$. Worst-case running time is in Θ((n-m+1)m).

16 Rabin-Karp Algorithm (continued). Pseudocode annotations: d is radix, q is modulus; high-order digit position for m-digit window; try all possible shifts; stopping condition; rule out spurious hit. Cost annotations: Θ(m) on three steps; Θ((n-m+1)m) for the loop that tries all possible shifts. Matching loop invariant: when line 10 is executed, $t_s = T[s+1..s+m] \bmod q$. Average-case analysis: assume reducing mod q acts like a random mapping from Σ* (the set of all finite-length strings formed from Σ) to $Z_q$. Estimate: the chance that $t_s \equiv p \pmod q$ is $1/q$. Expected number of spurious hits is in O(n/q). Expected matching time = O(n) + O(m(v + n/q)), where v = number of valid shifts; the O(n) term covers preprocessing plus $t_s$ updates, and the O(m(v + n/q)) term is the time for explicit matching comparisons. If v is in O(1) and q ≥ m, the average-case running time is in O(n+m).
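
A minimal Python sketch of the Rabin-Karp matcher follows. It is an illustration under stated assumptions rather than the slide's pseudocode: characters are mapped to numbers with `ord`, the defaults d = 256 and q = 1000003 are arbitrary choices, the function name is hypothetical, and shifts are 0-based.

```python
def rabin_karp_matcher(text: str, pattern: str,
                       d: int = 256, q: int = 1_000_003) -> list[int]:
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)  # weight of the high-order digit, mod q
    p = t = 0
    for i in range(m):    # preprocessing: Horner's rule mod q
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    shifts = []
    for s in range(n - m + 1):  # try all possible shifts
        # Equal fingerprints may be a spurious hit, so compare explicitly.
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:  # stopping condition: no window after the last shift
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return shifts
```

The explicit comparison runs only when the fingerprints agree, which is why the expected cost depends on the number of valid shifts v plus the expected number of spurious hits n/q. A small modulus such as q = 13 makes spurious hits frequent but does not affect correctness.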

17 String Matching Algorithms: Finite Automata

18 Finite Automata. δ (32.6). Strategy: build automaton for pattern, then examine each text character once. Worst-case running time is in Θ(n) + automaton creation time.

19 Finite Automata. Difference: our automaton will find all occurrences of the pattern.

20 String-Matching Automaton Pattern = P = ababaca Absent arrows go to state 0. Automaton accepts all strings ending in P; catches all matches

21 String-Matching Automaton. Suffix function for P: σ(x) = length of longest prefix of P that is a suffix of x, i.e. $\sigma(x) = \max\{k : P_k \sqsupset x\}$ (32.3). (32.4): we will build up to this proof. Automaton's operational invariant (32.5), for the i-character prefix of T at each step: keep track of the longest pattern prefix that is a suffix of what has been read so far.
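
The suffix function can be evaluated directly from its definition. The following brute-force sketch (hypothetical function name) is meant only to make the definition concrete; it is not how the automaton computes its state.

```python
def suffix_function(pattern: str, x: str) -> int:
    """sigma(x): length of the longest prefix of pattern that is a suffix of x."""
    # Try candidate lengths from longest to shortest; k = 0 always succeeds
    # because the empty string is a suffix of every string.
    for k in range(min(len(pattern), len(x)), -1, -1):
        if x.endswith(pattern[:k]):
            return k
    return 0  # not reached; kept so every path returns an int
```

With pattern `ab`, this gives σ(ε) = 0, σ(ccaca) = 1, and σ(ccab) = 2.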

22 String-Matching Automaton. Simulate behavior of a string-matching automaton that finds occurrences of pattern P of length m in T[1..n]. We'll show the automaton is in state $\sigma(T_i)$ after scanning character $T[i]$. Since $\sigma(T_i) = m$ iff $P \sqsupset T_i$, the machine is in accepting state m iff it has just scanned pattern P. Assuming the automaton has already been created, the worst-case running time of matching is in Θ(n).
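
The matching loop itself is short once a transition table exists. In this sketch the table `delta` is assumed to be a list of dictionaries, one per state, mapping a character to the next state; that representation and the function name are illustrative choices, and missing entries are treated as transitions to state 0.

```python
def finite_automaton_matcher(text: str, delta: list[dict[str, int]],
                             m: int) -> list[int]:
    """Scan text once; report 0-based shifts where accepting state m is reached."""
    q = 0  # current state = length of the longest pattern prefix matched so far
    shifts = []
    for i, ch in enumerate(text):
        q = delta[q].get(ch, 0)  # absent arrows go to state 0
        if q == m:               # just scanned an occurrence of the pattern
            shifts.append(i + 1 - m)
    return shifts
```

For the pattern `ab`, a hand-built table is `[{"a": 1}, {"a": 1, "b": 2}, {"a": 1}]`, and scanning `abcabab` with it reports shifts 0, 3, and 5. Each text character triggers exactly one table lookup, which gives the Θ(n) matching time.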

23 String-Matching Automaton (continued). Correctness of matching procedure: the automaton keeps track of the longest pattern prefix that is a suffix of what has been read so far in the text. Board work uses (32.4) and Lemma 32.3, $\sigma(xa) = \sigma(P_{\sigma(x)}\,a)$, which is to be proved next.

24 String-Matching Automaton (continued). Correctness of matching procedure... to be used to prove Lemma. (Figure label: $= P_{\sigma(xa)}$.)

25 String-Matching Automaton (continued). Correctness of matching procedure. (Figure labels: $= P_{\sigma(x)}$; $= P_{\sigma(xa)}$.)

26 String-Matching Automaton (continued). Correctness of matching procedure is now established, using (32.4) and Lemma 32.3: $\sigma(xa) = \sigma(P_{\sigma(x)}\,a)$.

27 String-Matching Automaton (continued). This procedure computes the transition function δ from a given pattern P[1..m]. Worst-case running time of automaton creation is in $O(m^3|\Sigma|)$; this can be improved to $O(m|\Sigma|)$. Worst-case running time of the entire string-matching strategy is in $O(m|\Sigma|)$ (automaton creation time) + O(n) (pattern matching time).
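
A straightforward way to build the transition table is to apply the suffix function to every (state, character) pair. The sketch below follows that cubic-time idea; the function name and the list-of-dictionaries table layout are assumptions made for illustration.

```python
def compute_transition_function(pattern: str,
                                alphabet: str) -> list[dict[str, int]]:
    """delta[q][a] = length of the longest prefix of pattern that is a
    suffix of pattern[:q] + a, for every state q in 0..m."""
    m = len(pattern)
    delta = []
    for q in range(m + 1):
        row = {}
        for a in alphabet:
            candidate = pattern[:q] + a
            k = min(m, q + 1)  # the new state can be at most q + 1
            # Shrink k until pattern[:k] is a suffix of candidate.
            # k = 0 always succeeds, so the loop terminates.
            while not candidate.endswith(pattern[:k]):
                k -= 1
            row[a] = k
        delta.append(row)
    return delta
```

There are (m+1)|Σ| table entries, each entry may try O(m) values of k, and each suffix test costs O(m), which matches the $O(m^3|\Sigma|)$ bound. For the pattern `ababaca` over the alphabet `abc`, state 5 maps `a` to 1, `b` to 4, and `c` to 6.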

28 String Matching Algorithms: Knuth-Morris-Pratt

29 Knuth-Morris-Pratt Overview. Achieve Θ(n+m) time by shortening automaton preprocessing time below $O(m|\Sigma|)$. Approach: don't precompute the automaton's transition function; calculate enough transition data on the fly; obtain the data via alphabet-independent pattern preprocessing; pattern preprocessing compares the pattern against shifts of itself. Use amortization for the running time calculation.

30 Knuth-Morris-Pratt Algorithm determine how pattern matches against itself

31 Knuth-Morris-Pratt Algorithm. (32.6) Equivalently, what is the largest $k < q$ such that $P_k \sqsupset P_q$? The prefix function π shows how the pattern matches against itself: $\pi(q) = \max\{k : k < q \text{ and } P_k \sqsupset P_q\}$. $\pi(q)$ is the length of the longest prefix of P that is a proper suffix of $P_q$. (Example figure.)
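
The definition of π can be checked on small patterns with a brute-force computation. This sketch (hypothetical function name) evaluates the max in the definition literally and is far slower than the Θ(m) procedure used by KMP; it returns a list indexed 1..m to match the slides, with index 0 unused.

```python
def prefix_function_by_definition(pattern: str) -> list[int]:
    """pi[q] = max{k : k < q and pattern[:k] is a suffix of pattern[:q]}."""
    m = len(pattern)
    pi = [0] * (m + 1)  # pi[0] is an unused placeholder
    for q in range(1, m + 1):
        prefix_q = pattern[:q]
        # k = 0 always qualifies, so the max is over a non-empty set.
        pi[q] = max(k for k in range(q) if prefix_q.endswith(pattern[:k]))
    return pi
```

For the pattern `ababaca`, the values π(1), ..., π(7) are 0, 0, 1, 2, 3, 0, 1.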

32 Knuth-Morris-Pratt Algorithm. KMP-MATCHER is somewhat similar in structure to FINITE-AUTOMATON-MATCHER. Overall: Θ(m+n) using amortized analysis (see next slide). Prefix-function computation: Θ(m). Matching: Θ(n) using amortized analysis*. Pseudocode annotations: number of characters matched; scan text left to right; next character does not match; next character matches; is all of P matched?; look for next match. (* 2nd edition uses potential function with Φ = q; 3rd edition uses aggregate analysis.)
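
The annotated steps can be sketched in Python as below. This is an illustrative 0-based version, not the slide's pseudocode: function names are hypothetical, `pi[i]` stores the prefix-function value for the prefix of length i+1, and shifts are 0-based.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """pi[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    m = len(pattern)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text: str, pattern: str) -> list[int]:
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of characters matched
    for i in range(n):  # scan text left to right
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]  # next character does not match
        if pattern[q] == text[i]:
            q += 1  # next character matches
        if q == m:  # is all of P matched?
            shifts.append(i - m + 1)
            q = pi[q - 1]  # look for next match
    return shifts
```

The text index never moves backward, and each step of the inner `while` loop strictly decreases q, which is the basis of the amortized Θ(n) matching bound.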

33 Knuth-Morris-Pratt Algorithm. Amortized analysis of the prefix-function computation (similar in structure to KMP-MATCHER), potential method: Φ = k, where k represents the current state of the algorithm. The potential is never negative since π(k) ≥ 0 for all k. Annotations: initial potential value; potential decreases; potential increases by ≤ 1 in each execution of the for loop body; amortized cost of loop body is in O(1); Θ(m) loop iterations; Θ(m) time. (Potential method: 2nd edition. The 3rd edition uses aggregate analysis to show the while loop executes O(m) times overall.)
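
The amortized bound can be observed empirically by counting inner-loop steps. The sketch below is an instrumented, 1-based variant of the prefix-function computation; the counter and the function name are additions made for illustration only.

```python
def compute_prefix_function_counted(pattern: str) -> tuple[list[int], int]:
    """Return (pi, steps): pi is indexed 1..m (pi[0] unused) and steps is
    the total number of while-loop iterations over the whole run."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0      # the potential: it starts at 0 and never goes negative
    steps = 0
    for q in range(2, m + 1):
        # pattern[k] is P[k+1] and pattern[q - 1] is P[q] in 1-based terms.
        while k > 0 and pattern[k] != pattern[q - 1]:
            k = pi[k]   # potential decreases by at least 1
            steps += 1
        if pattern[k] == pattern[q - 1]:
            k += 1      # potential increases by at most 1 per for-iteration
        pi[q] = k
    return pi, steps
```

Since k rises by at most 1 in each of the m-1 iterations of the `for` loop and every `while` step lowers it by at least 1, the total number of `while` steps cannot exceed m-1. For `ababaca` the counter ends at 2.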

34 Knuth-Morris-Pratt Algorithm. Correctness... Iterated prefix function:

35 Knuth-Morris-Pratt Algorithm. Correctness...

36 StringMatch. Correctness of Compute-Prefix-Function. This is nontrivial. Lemma (Prefix-function iteration lemma): Let P be a pattern of length m with prefix function π. Then, for $q = 1, 2, \ldots, m$, we have $\pi^*[q] = \{k : k < q \text{ and } P_k \sqsupset P_q\}$. Proof (writing $\sqsupset$ for the suffix relation). 1. $\pi^*[q] \subseteq \{k : k < q \text{ and } P_k \sqsupset P_q\}$. Let $i = \pi^{(u)}[q]$ for some $u > 0$. We prove the inclusion by induction on u. For $u = 1$, we have $i = \pi[q]$, and the claim follows since $i < q$ and $P_{\pi[q]} \sqsupset P_q$. Assume the inclusion holds for $i = \pi^{(u)}[q]$. We need to prove it for $i = \pi^{(u+1)}[q] = \pi[\pi^{(u)}[q]]$. Here $i < \pi^{(u)}[q]$ and $P_i \sqsupset P_{\pi^{(u)}[q]}$. (4/23/2012; source: Textbook and Prof. Pecelli)

37 StringMatch. By the induction assumption, $P_{\pi^{(u)}[q]} \sqsupset P_q$. Transitivity of the relation gives that $P_i \sqsupset P_q$, as desired. 2. $\{k : k < q \text{ and } P_k \sqsupset P_q\} \subseteq \pi^*[q]$. By contradiction. Suppose, to the contrary, that there is an integer in $\{k : k < q \text{ and } P_k \sqsupset P_q\} - \pi^*[q]$, and let $j$ denote the largest such integer. Since $\pi[q]$ is the largest value in $\{k : k < q \text{ and } P_k \sqsupset P_q\}$, and $\pi[q] \in \pi^*[q]$, we must have $j < \pi[q]$. Let $j'$ denote the smallest integer in $\pi^*[q]$ such that $j' > j$. $j \in \{k : k < q \text{ and } P_k \sqsupset P_q\}$ implies $P_j \sqsupset P_q$; $j' \in \pi^*[q]$ implies $P_{j'} \sqsupset P_q$. Lemma 32.1 (Overlapping Suffix) implies that $P_j \sqsupset P_{j'}$, and $j$ is the largest value less than $j'$ with this property.

38 StringMatch. This, in turn, forces the conclusion that $\pi[j'] = j$ and, since $j' \in \pi^*[q]$, we must have $j \in \pi^*[q]$. Contradiction. We now continue with another lemma: it is clear that, since $\pi[1] = 0$, line 2 of Compute-Prefix-Function provides the correct value. We need to extend this statement to all $q > 1$.
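
The prefix-function iteration lemma can be spot-checked numerically. The sketch below uses hypothetical helper names and 1-based indexing for π; it builds π*[q] by iterating π until 0 is reached and compares it with the set of all k < q for which $P_k$ is a suffix of $P_q$. A check on a few patterns is only a sanity test, not a substitute for the proof.

```python
def prefix_function(pattern: str) -> list[int]:
    """pi[q] for q = 1..m, straight from the definition (pi[0] unused)."""
    m = len(pattern)
    pi = [0] * (m + 1)
    for q in range(1, m + 1):
        pi[q] = max(k for k in range(q) if pattern[:q].endswith(pattern[:k]))
    return pi


def pi_star(pi: list[int], q: int) -> set[int]:
    """{pi[q], pi[pi[q]], ...}, iterating until the value 0 is reached."""
    k = pi[q]
    values = {k}
    while k > 0:
        k = pi[k]
        values.add(k)
    return values


def proper_suffix_prefix_lengths(pattern: str, q: int) -> set[int]:
    """{k : k < q and pattern[:k] is a suffix of pattern[:q]}."""
    return {k for k in range(q) if pattern[:q].endswith(pattern[:k])}
```

For the pattern `ababababca`, π*[8] comes out as {6, 4, 2, 0}, which is exactly the set of lengths k < 8 for which $P_k$ is a suffix of $P_8$.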

39 StringMatch. Lemma: Let $P = P[1..m]$, and let π be the prefix function for P. For $q = 1, 2, \ldots, m$, if $\pi[q] > 0$, then $\pi[q] - 1 \in \pi^*[q-1]$. Proof. If $r = \pi[q] > 0$, then $r < q$ and $P_r \sqsupset P_q$. Thus $r - 1 < q - 1$ and $P_{r-1} \sqsupset P_{q-1}$ (by dropping the last characters from $P_r$ and $P_q$). Lemma 32.5 implies that $\pi[q] - 1 = r - 1 \in \pi^*[q-1]$.

40 StringMatch. We now introduce a new set: for $q = 2, 3, \ldots, m$, define $E_{q-1} \subseteq \pi^*[q-1]$ by: $E_{q-1} = \{k \in \pi^*[q-1] : P[k+1] = P[q]\} = \{k : k < q-1 \text{ and } P_k \sqsupset P_{q-1} \text{ and } P[k+1] = P[q]\} = \{k : k < q-1 \text{ and } P_{k+1} \sqsupset P_q\}$. In other words, $E_{q-1}$ consists of the values $k < q-1$ for which $P_k \sqsupset P_{q-1}$ and for which $P_{k+1} \sqsupset P_q$, because $P[k+1] = P[q]$. $E_{q-1}$ consists of those values $k \in \pi^*[q-1]$ for which we can extend $P_k$ to $P_{k+1}$ and still get a proper suffix of $P_q$.

41 StringMatch. Corollary: Let P be a pattern of length m, and let π be the prefix function for P. For $q = 2, 3, \ldots, m$: $\pi[q] = 0$ if $E_{q-1} = \emptyset$, and $\pi[q] = 1 + \max\{k \in E_{q-1}\}$ if $E_{q-1} \ne \emptyset$. Proof. Case 1: $E_{q-1}$ is empty. There is no $k \in \pi^*[q-1]$ (including $k = 0$) for which we can extend $P_k$ to $P_{k+1}$ and get a proper suffix of $P_q$. Thus $\pi[q] = 0$. Case 2: $E_{q-1}$ is not empty. 1. Prove $\pi[q] \ge 1 + \max\{k \in E_{q-1}\}$. For each $k \in E_{q-1}$ we have $k+1 < q$ and $P_{k+1} \sqsupset P_q$. The definition of $\pi[q]$ gives the inequality.

42 StringMatch. 2. Prove that $\pi[q] \le 1 + \max\{k \in E_{q-1}\}$. Since $E_{q-1}$ is non-empty, $\pi[q] > 0$. Let $r = \pi[q] - 1$, hence $r + 1 = \pi[q]$. Since $r + 1 > 0$, $P[r+1] = P[q]$. By Lemma 32.6 we also have $r = \pi[q] - 1 \in \pi^*[q-1]$. Therefore $r \in E_{q-1}$, which implies $r \le \max\{k \in E_{q-1}\}$ and, immediately, the desired inequality. Combining both inequalities, we have the result. Now glue all these results together to obtain a proof of correctness.
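
The corollary can be illustrated by computing $E_{q-1}$ explicitly and comparing against π[q]. The helper names below are hypothetical, π is indexed 1..m, and `pattern[k]` stands for P[k+1] because Python strings are 0-based. This is a sanity check on sample patterns, not a proof.

```python
def prefix_function(pattern: str) -> list[int]:
    """pi[q] for q = 1..m, straight from the definition (pi[0] unused)."""
    m = len(pattern)
    pi = [0] * (m + 1)
    for q in range(1, m + 1):
        pi[q] = max(k for k in range(q) if pattern[:q].endswith(pattern[:k]))
    return pi


def pi_star(pi: list[int], q: int) -> set[int]:
    """{pi[q], pi[pi[q]], ...}, iterating until the value 0 is reached."""
    k = pi[q]
    values = {k}
    while k > 0:
        k = pi[k]
        values.add(k)
    return values


def extension_set(pattern: str, pi: list[int], q: int) -> set[int]:
    """E_{q-1}: the k in pi*[q-1] with P[k+1] = P[q], for q = 2..m."""
    return {k for k in pi_star(pi, q - 1) if pattern[k] == pattern[q - 1]}


def pi_from_corollary(pattern: str, pi: list[int], q: int) -> int:
    """The value the corollary predicts for pi[q]."""
    candidates = extension_set(pattern, pi, q)
    return 1 + max(candidates) if candidates else 0
```

For `ababaca`, $E_5$ is empty (no border of $P_5$ can be extended by `c`), so π[6] = 0, while $E_6$ = {0} gives π[7] = 1.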

Graduate Algorithms CS F-20 String Matching

Graduate Algorithms CS F-20 String Matching Graduate Algorithms CS673-2016F-20 String Matching David Galles Department of Computer Science University of San Francisco 20-0: String Matching Given a source text, and a string to match, where does the

More information

Overview. Knuth-Morris-Pratt & Boyer-Moore Algorithms. Notation Review (2) Notation Review (1) The Kunth-Morris-Pratt (KMP) Algorithm

Overview. Knuth-Morris-Pratt & Boyer-Moore Algorithms. Notation Review (2) Notation Review (1) The Kunth-Morris-Pratt (KMP) Algorithm Knuth-Morris-Pratt & s by Robert C. St.Pierre Overview Notation review Knuth-Morris-Pratt algorithm Discussion of the Algorithm Example Boyer-Moore algorithm Discussion of the Algorithm Example Applications

More information

Algorithms: COMP3121/3821/9101/9801

Algorithms: COMP3121/3821/9101/9801 NEW SOUTH WALES Algorithms: COMP3121/3821/9101/9801 Aleks Ignjatović School of Computer Science and Engineering University of New South Wales LECTURE 8: STRING MATCHING ALGORITHMS COMP3121/3821/9101/9801

More information

Efficient Sequential Algorithms, Comp309

Efficient Sequential Algorithms, Comp309 Efficient Sequential Algorithms, Comp309 University of Liverpool 2010 2011 Module Organiser, Igor Potapov Part 2: Pattern Matching References: T. H. Cormen, C. E. Leiserson, R. L. Rivest Introduction to

More information

Lecture 3: String Matching

Lecture 3: String Matching COMP36111: Advanced Algorithms I Lecture 3: String Matching Ian Pratt-Hartmann Room KB2.38: email: ipratt@cs.man.ac.uk 2017 18 Outline The string matching problem The Rabin-Karp algorithm The Knuth-Morris-Pratt

More information

String Search. 6th September 2018

String Search. 6th September 2018 String Search 6th September 2018 Search for a given (short) string in a long string Search problems have become more important lately The amount of stored digital information grows steadily (rapidly?)

More information

Module 9: Tries and String Matching

Module 9: Tries and String Matching Module 9: Tries and String Matching CS 240 - Data Structures and Data Management Sajed Haque Veronika Irvine Taylor Smith Based on lecture notes by many previous cs240 instructors David R. Cheriton School

More information

INF 4130 / /8-2017

INF 4130 / /8-2017 INF 4130 / 9135 28/8-2017 Algorithms, efficiency, and complexity Problem classes Problems can be divided into sets (classes). Problem classes are defined by the type of algorithm that can (or cannot) solve

More information

2. Exact String Matching

2. Exact String Matching 2. Exact String Matching Let T = T [0..n) be the text and P = P [0..m) the pattern. We say that P occurs in T at position j if T [j..j + m) = P. Example: P = aine occurs at position 6 in T = karjalainen.

More information

INF 4130 / /8-2014

INF 4130 / /8-2014 INF 4130 / 9135 26/8-2014 Mandatory assignments («Oblig-1», «-2», and «-3»): All three must be approved Deadlines around: 25. sept, 25. oct, and 15. nov Other courses on similar themes: INF-MAT 3370 INF-MAT

More information

Data Structures and Algorithm. Xiaoqing Zheng

Data Structures and Algorithm. Xiaoqing Zheng Dt Strutures nd Algorithm Xioqing Zheng zhengxq@fudn.edu.n String mthing prolem Pttern P ours with shift s in text T (or, equivlently, tht pttern P ours eginning t position s + in text T) if T[s +... s

More information

Knuth-Morris-Pratt Algorithm

Knuth-Morris-Pratt Algorithm Knuth-Morris-Pratt Algorithm Jayadev Misra June 5, 2017 The Knuth-Morris-Pratt string matching algorithm (KMP) locates all occurrences of a pattern string in a text string in linear time (in the combined

More information

Pattern Matching. a b a c a a b. a b a c a b. a b a c a b. Pattern Matching 1

Pattern Matching. a b a c a a b. a b a c a b. a b a c a b. Pattern Matching 1 Pattern Matching a b a c a a b 1 4 3 2 Pattern Matching 1 Outline and Reading Strings ( 9.1.1) Pattern matching algorithms Brute-force algorithm ( 9.1.2) Boyer-Moore algorithm ( 9.1.3) Knuth-Morris-Pratt

More information

String Matching. Thanks to Piotr Indyk. String Matching. Simple Algorithm. for s 0 to n-m. Match 0. for j 1 to m if T[s+j] P[j] then

String Matching. Thanks to Piotr Indyk. String Matching. Simple Algorithm. for s 0 to n-m. Match 0. for j 1 to m if T[s+j] P[j] then String Matching Thanks to Piotr Indyk String Matching Input: Two strings T[1 n] and P[1 m], containing symbols from alphabet Σ Goal: find all shifts 0 s n-m such that T[s+1 s+m]=p Example: Σ={,a,b,,z}

More information

String Matching. Jayadev Misra The University of Texas at Austin December 5, 2003

String Matching. Jayadev Misra The University of Texas at Austin December 5, 2003 String Matching Jayadev Misra The University of Texas at Austin December 5, 2003 Contents 1 Introduction 1 2 Rabin-Karp Algorithm 3 3 Knuth-Morris-Pratt Algorithm 5 3.1 Informal Description.........................

More information

Pattern Matching. a b a c a a b. a b a c a b. a b a c a b. Pattern Matching Goodrich, Tamassia

Pattern Matching. a b a c a a b. a b a c a b. a b a c a b. Pattern Matching Goodrich, Tamassia Pattern Matching a b a c a a b 1 4 3 2 Pattern Matching 1 Brute-Force Pattern Matching ( 11.2.1) The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift

More information

15 Text search. P.D. Dr. Alexander Souza. Winter term 11/12

15 Text search. P.D. Dr. Alexander Souza. Winter term 11/12 Algorithms Theory 15 Text search P.D. Dr. Alexander Souza Text search Various scenarios: Dynamic texts Text editors Symbol manipulators Static texts Literature databases Library systems Gene databases

More information

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007.

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007. Proofs, Strings, and Finite Automata CS154 Chris Pollett Feb 5, 2007. Outline Proofs and Proof Strategies Strings Finding proofs Example: For every graph G, the sum of the degrees of all the nodes in G

More information

Algorithm Theory. 13 Text Search - Knuth, Morris, Pratt, Boyer, Moore. Christian Schindelhauer

Algorithm Theory. 13 Text Search - Knuth, Morris, Pratt, Boyer, Moore. Christian Schindelhauer Algorithm Theory 13 Text Search - Knuth, Morris, Pratt, Boyer, Moore Institut für Informatik Wintersemester 2007/08 Text Search Scenarios Static texts Literature databases Library systems Gene databases

More information

All three must be approved Deadlines around: 21. sept, 26. okt, and 16. nov

All three must be approved Deadlines around: 21. sept, 26. okt, and 16. nov INF 4130 / 9135 29/8-2012 Today s slides are produced mainly by Petter Kristiansen Lecturer Stein Krogdahl Mandatory assignments («Oblig1», «-2», and «-3»): All three must be approved Deadlines around:

More information

Knuth-Morris-Pratt Algorithm

Knuth-Morris-Pratt Algorithm Knuth-Morris-Pratt Algorithm The roblem of tring Matching Given a string, the roblem of string matching deals with finding whether a attern occurs in and if does occur then returning osition in where occurs.

More information

New Minimal Weight Representations for Left-to-Right Window Methods

New Minimal Weight Representations for Left-to-Right Window Methods New Minimal Weight Representations for Left-to-Right Window Methods James A. Muir 1 and Douglas R. Stinson 2 1 Department of Combinatorics and Optimization 2 School of Computer Science University of Waterloo

More information

Lecture 9 Tuesday, 4/20/10. Linear Programming

Lecture 9 Tuesday, 4/20/10. Linear Programming UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Spring, 2010 Lecture 9 Tuesday, 4/20/10 Linear Programming 1 Overview Motivation & Basics Standard & Slack Forms Formulating

More information

Pattern Matching (Exact Matching) Overview

Pattern Matching (Exact Matching) Overview CSI/BINF 5330 Pattern Matching (Exact Matching) Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Pattern Matching Exhaustive Search DFA Algorithm KMP Algorithm

More information

Maximal Unbordered Factors of Random Strings arxiv: v1 [cs.ds] 14 Apr 2017

Maximal Unbordered Factors of Random Strings arxiv: v1 [cs.ds] 14 Apr 2017 Maximal Unbordered Factors of Random Strings arxiv:1704.04472v1 [cs.ds] 14 Apr 2017 Patrick Hagge Cording 1 and Mathias Bæk Tejs Knudsen 2 1 DTU Compute, Technical University of Denmark, phaco@dtu.dk 2

More information

String Regularities and Degenerate Strings

String Regularities and Degenerate Strings M. Sc. Thesis Defense Md. Faizul Bari (100705050P) Supervisor: Dr. M. Sohel Rahman String Regularities and Degenerate Strings Department of Computer Science and Engineering Bangladesh University of Engineering

More information

Linear Selection and Linear Sorting

Linear Selection and Linear Sorting Analysis of Algorithms Linear Selection and Linear Sorting If more than one question appears correct, choose the more specific answer, unless otherwise instructed. Concept: linear selection 1. Suppose

More information

Lecture 5: The Shift-And Method

Lecture 5: The Shift-And Method Biosequence Algorithms, Spring 2005 Lecture 5: The Shift-And Method Pekka Kilpeläinen University of Kuopio Department of Computer Science BSA Lecture 5: Shift-And p.1/19 Seminumerical String Matching Most

More information

Searching Sear ( Sub- (Sub )Strings Ulf Leser

Searching Sear ( Sub- (Sub )Strings Ulf Leser Searching (Sub-)Strings Ulf Leser This Lecture Exact substring search Naïve Boyer-Moore Searching with profiles Sequence profiles Ungapped approximate search Statistical evaluation of search results Ulf

More information

A GREEDY APPROXIMATION ALGORITHM FOR CONSTRUCTING SHORTEST COMMON SUPERSTRINGS *

A GREEDY APPROXIMATION ALGORITHM FOR CONSTRUCTING SHORTEST COMMON SUPERSTRINGS * A GREEDY APPROXIMATION ALGORITHM FOR CONSTRUCTING SHORTEST COMMON SUPERSTRINGS * 1 Jorma Tarhio and Esko Ukkonen Department of Computer Science, University of Helsinki Tukholmankatu 2, SF-00250 Helsinki,

More information

Pattern-Matching for Strings with Short Descriptions

Pattern-Matching for Strings with Short Descriptions Pattern-Matching for Strings with Short Descriptions Marek Karpinski marek@cs.uni-bonn.de Department of Computer Science, University of Bonn, 164 Römerstraße, 53117 Bonn, Germany Wojciech Rytter rytter@mimuw.edu.pl

More information

Hash tables. Hash tables

Hash tables. Hash tables Basic Probability Theory Two events A, B are independent if Conditional probability: Pr[A B] = Pr[A] Pr[B] Pr[A B] = Pr[A B] Pr[B] The expectation of a (discrete) random variable X is E[X ] = k k Pr[X

More information

Subset construction. We have defined for a DFA L(A) = {x Σ ˆδ(q 0, x) F } and for A NFA. For any NFA A we can build a DFA A D such that L(A) = L(A D )

Subset construction. We have defined for a DFA L(A) = {x Σ ˆδ(q 0, x) F } and for A NFA. For any NFA A we can build a DFA A D such that L(A) = L(A D ) Search algorithm Clever algorithm even for a single word Example: find abac in abaababac See Knuth-Morris-Pratt and String searching algorithm on wikipedia 2 Subset construction We have defined for a DFA

More information

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction.

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction. Microsoft Research Asia September 5, 2005 1 2 3 4 Section I What is? Definition is a technique for efficiently recurrence computing by storing partial results. In this slides, I will NOT use too many formal

More information

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17 12.1 Introduction Today we re going to do a couple more examples of dynamic programming. While

More information

Approximate Pattern Matching and the Query Complexity of Edit Distance

Approximate Pattern Matching and the Query Complexity of Edit Distance Krzysztof Onak Approximate Pattern Matching p. 1/20 Approximate Pattern Matching and the Query Complexity of Edit Distance Joint work with: Krzysztof Onak MIT Alexandr Andoni (CCI) Robert Krauthgamer (Weizmann

More information

Sri vidya college of engineering and technology

Sri vidya college of engineering and technology Unit I FINITE AUTOMATA 1. Define hypothesis. The formal proof can be using deductive proof and inductive proof. The deductive proof consists of sequence of statements given with logical reasoning in order

More information

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever.  ETH Zürich (D-ITET) September, Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu ETH Zürich (D-ITET) September, 24 2015 Last week was all about Deterministic Finite Automaton We saw three main

More information

Improving the KMP Algorithm by Using Properties of Fibonacci String

Improving the KMP Algorithm by Using Properties of Fibonacci String Improving the KMP Algorithm by Using Properties of Fibonacci String Yi-Kung Shieh and R. C. T. Lee Department of Computer Science National Tsing Hua University d9762814@oz.nthu.edu.tw and rctlee@ncnu.edu.tw

More information

Where did dynamic programming come from?

Where did dynamic programming come from? Where did dynmic progrmming come from? String lgorithms Dvid Kuchk cs302 Spring 2012 Richrd ellmn On the irth of Dynmic Progrmming Sturt Dreyfus http://www.eng.tu.c.il/~mi/cd/ or50/1526-5463-2002-50-01-0048.pdf

More information

On Boyer-Moore Preprocessing

On Boyer-Moore Preprocessing On Boyer-Moore reprocessing Heikki Hyyrö Department of Computer Sciences University of Tampere, Finland Heikki.Hyyro@cs.uta.fi Abstract robably the two best-known exact string matching algorithms are the

More information

PATTERN MATCHING WITH SWAPS IN PRACTICE

PATTERN MATCHING WITH SWAPS IN PRACTICE International Journal of Foundations of Computer Science c World Scientific Publishing Company PATTERN MATCHING WITH SWAPS IN PRACTICE MATTEO CAMPANELLI Università di Catania, Scuola Superiore di Catania

More information

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS Automata Theory Lecture on Discussion Course of CS2 This Lecture is about Mathematical Models of Computation. Why Should I Care? - Ways of thinking. - Theory can drive practice. - Don t be an Instrumentalist.

More information

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata CISC 4090: Theory of Computation Chapter Regular Languages Xiaolan Zhang, adapted from slides by Prof. Werschulz Section.: Finite Automata Fordham University Department of Computer and Information Sciences

More information

Deterministic Finite Automaton (DFA)

Deterministic Finite Automaton (DFA) 1 Lecture Overview Deterministic Finite Automata (DFA) o accepting a string o defining a language Nondeterministic Finite Automata (NFA) o converting to DFA (subset construction) o constructed from a regular

More information

CSC236 Week 10. Larry Zhang

CSC236 Week 10. Larry Zhang CSC236 Week 10 Larry Zhang 1 Today s Topic Deterministic Finite Automata (DFA) 2 Recap of last week We learned a lot of terminologies alphabet string length of string union concatenation Kleene star language

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION FORMAL LANGUAGES, AUTOMATA AND COMPUTATION DECIDABILITY ( LECTURE 15) SLIDES FOR 15-453 SPRING 2011 1 / 34 TURING MACHINES-SYNOPSIS The most general model of computation Computations of a TM are described

More information

Text Searching. Thierry Lecroq Laboratoire d Informatique, du Traitement de l Information et des

Text Searching. Thierry Lecroq Laboratoire d Informatique, du Traitement de l Information et des Text Searching Thierry Lecroq Thierry.Lecroq@univ-rouen.fr Laboratoire d Informatique, du Traitement de l Information et des Systèmes. International PhD School in Formal Languages and Applications Tarragona,

More information

Computability and Complexity

Computability and Complexity Computability and Complexity Lecture 5 Reductions Undecidable problems from language theory Linear bounded automata given by Jiri Srba Lecture 5 Computability and Complexity 1/14 Reduction Informal Definition

More information

Samson Zhou. Pattern Matching over Noisy Data Streams

Samson Zhou. Pattern Matching over Noisy Data Streams Samson Zhou Pattern Matching over Noisy Data Streams Finding Structure in Data Pattern Matching Finding all instances of a pattern within a string ABCD ABCAABCDAACAABCDBCABCDADDDEAEABCDA Knuth-Morris-Pratt

More information

String Range Matching

String Range Matching String Range Matching Juha Kärkkäinen, Dominik Kempa, and Simon J. Puglisi Department of Computer Science, University of Helsinki Helsinki, Finland firstname.lastname@cs.helsinki.fi Abstract. Given strings

More information

CSCE 551: Chin-Tser Huang. University of South Carolina

CSCE 551: Chin-Tser Huang. University of South Carolina CSCE 551: Theory of Computation Chin-Tser Huang huangct@cse.sc.edu University of South Carolina Computation History A computation history of a TM M is a sequence of its configurations C 1, C 2,, C l such

More information

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever.   ETH Zürich (D-ITET) October, Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu ETH Zürich (D-ITET) October, 5 2017 Part 3 out of 5 Last week, we learned about closure and equivalence of regular

More information

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu Part 3 out of 5 ETH Zürich (D-ITET) October, 5 2017 Last week, we learned about closure and equivalence of regular

More information

Optimal Superprimitivity Testing for Strings
