Data Structures and Algorithm. Xiaoqing Zheng

Similar documents
Prefix-Free Regular-Expression Matching

Algorithms: COMP3121/3821/9101/9801

Graduate Algorithms CS F-20 String Matching

Overview. Knuth-Morris-Pratt & Boyer-Moore Algorithms. Notation Review (2) Notation Review (1) The Kunth-Morris-Pratt (KMP) Algorithm

Fingerprint idea. Assume:

Where did dynamic programming come from?

Module 9: Tries and String Matching

Module 9: Tries and String Matching

Analysis of Algorithms Prof. Karen Daniels

Minimal DFA. minimal DFA for L starting from any other

CS311 Computational Structures Regular Languages and Regular Grammars. Lecture 6

Finite State Automata and Determinisation

Nondeterministic Finite Automata

Chapter 2 Finite Automata

Running an NFA & the subset algorithm (NFA->DFA) CS 350 Fall 2018 gilray.org/classes/fall2018/cs350/

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute

NFAs continued, Closure Properties of Regular Languages

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

Homework 3 Solutions

Efficient Sequential Algorithms, Comp309

Nondeterministic Automata vs Deterministic Automata

Formal Languages and Automata

Module 9: Tries and String Matching

NFAs continued, Closure Properties of Regular Languages

Compression of Palindromes and Regularity.

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

11.1 Finite Automata. CS125 Lecture 11 Fall Motivation: TMs without a tape: maybe we can at least fully understand such a simple model?

@#? Text Search ] { "!" Nondeterministic Finite Automata. Transformation NFA to DFA and Simulation of NFA. Text Search Using Automata

Languages & Automata

CISC 4090 Theory of Computation

CHAPTER 1 Regular Languages. Contents

Fundamentals of Computer Science

Regular expressions, Finite Automata, transition graphs are all the same!!

NON-DETERMINISTIC FSA

First Midterm Examination

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution

Subsequence Automata with Default Transitions

1 Nondeterministic Finite Automata

Designing finite automata II

CS 573 Automata Theory and Formal Languages

Regular Languages and Applications

Lecture 3: String Matching

2.4 Theoretical Foundations

Worked out examples Finite Automata

= state, a = reading and q j

Theory of Computation Regular Languages

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Finite-State Automata: Recap

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck.

CS241 Week 6 Tutorial Solutions

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

Deciding the value 1 problem for probabilistic leaktight automata

Convert the NFA into DFA

Coalgebra, Lecture 15: Equations for Deterministic Automata

Regular languages refresher

Lecture 08: Feb. 08, 2019

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers. Mehryar Mohri Courant Institute and Google Research

Section: Other Models of Turing Machines. Definition: Two automata are equivalent if they accept the same language.

Winter 2016 COMP-250: Introduction to Computer Science. Lecture 23, April 5, 2016

Let's start with an example:

Balanced binary search trees

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

1.4 Nonregular Languages

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA)

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

State Complexity of Union and Intersection of Binary Suffix-Free Languages

Lecture 9: LTL and Büchi Automata

Thoery of Automata CS402

Discrete Structures, Test 2 Monday, March 28, 2016 SOLUTIONS, VERSION α

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings...

Java II Finite Automata I

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Fast index for approximate string matching

Closure Properties of Regular Languages

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers

Deterministic Finite Automata

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

XML and Databases. Outline. 1. Top-Down Evaluation of Simple Paths. 1. Top-Down Evaluation of Simple Paths. 1. Top-Down Evaluation of Simple Paths

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER MACHINES AND THEIR LANGUAGES ANSWERS

Chapter 4 Regular Grammar and Regular Sets. (Solutions / Hints)

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

NFA and regex. the Boolean algebra of languages. non-deterministic machines. regular expressions

On-Line Construction. of Suffix Trees. Overview. Suffix Trees. Notations. goo. Suffix tries

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata

Periodic string comparison

Nondeterminism and Nodeterministic Automata

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

Algorithm Theory. 13 Text Search - Knuth, Morris, Pratt, Boyer, Moore. Christian Schindelhauer

Formal languages, automata, and theory of computation

Harvard University Computer Science 121 Midterm October 23, 2012

First Midterm Examination

3 Regular expressions

Global alignment. Genome Rearrangements Finding preserved genes. Lecture 18

Finite Automata-cont d

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

MATH 409 Advanced Calculus I Lecture 22: Improper Riemann integrals.

Compiler Design. Spring Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Transcription:

Dt Strutures nd Algorithm Xioqing Zheng zhengxq@fudn.edu.n

String mthing prolem Pttern P ours with shift s in text T (or, equivlently, tht pttern P ours eginning t position s + in text T) if T[s +... s + m] = P[... m] nd s n m. If P ours with shift s in T, the we ll s vlid shift. The string-mthing prolem is the prolem of finding ll vlid shifts with whih given pttern P ours in given text T. Text T Pttern P s =

Nottion nd terminology Σ * Σ ε x xy w w P k x x finite lphet. = {,,..., z}. set of ll finite-length strings formed using hrters from the lphet. zero-length empty string elongs to *. length of string x. ontention of two string x nd y. w is prefix of string x, if x = wy for some string y *. w is suffix of string x, if x = yw for some string y *. k-hrter prefix P[... k] of the pttern P.

Nive string-mthing lgorithm NAIVE-STRING-MATCHER(T, P ). n length[t]. m length[p]. for s to n m. do if P[... m] = T[s +... s + m]. then print "Pttern ours with shift" s Running time is O((n m + )m)

Ide of Rin-Krp lgorithm Input hrters nd string n e represented y grphil symols or digits. Given pttern P[... m], let p denote its orresponding deiml vlue. p = P[m] + (P[m ]) + (P[m ] +... + (P[] + P[])...)) We lso let t s denote the deiml vlue of the length-m sustring T[s +... s + m], for s =,,..., n m. Then, t s = p if nd only if T[s +... s + m] = P[... m].

Ide of Rin-Krp lgorithm t s+ n e omputed from t s in onstnt time, sine, t s+ = (t s m T[s + ]) + T[s + m + ]. Constnt is preomputed whih n e done in time O(lgm) For exmple, if m = nd t s =, then we remove the high-order digit T[s + ] = nd ring in the new low-order digit T[s + + ] = to otin t s+ = ( ) + =.

Improved Rin-Krp lgorithm t s+ = ((t s T[s + ]h) + T[s + m + ]) mod q. q is typilly hosen s prime suh tht q just fits within one omputer word. h m (mod q).

Improved Rin-Krp lgorithm old highorder digit 8 new loworder digit ( ) + (mod) ( ) + (mod) 8(mod)

Improved Rin-Krp lgorithm 8 9 8 9 9 9...... 8 9 8 9 mod Vlid mth Spurious hit ts p(mod q) does not imply tht t s = p.

Rin-Krp lgorithm RABIN-KARP-MATCHER(T, P, d, q ). n length[t]. m length[p]. h d m mod q. p. t. for i to m // Preproessing.. do p (dp + P[i]) mod q 8. do t (dt + T[i]) mod q 9. for s to n m // Mthing.. do if p = t s. then if P[... m] = T[s +... s + m] Preproessing Ф(m) Mthing O((n m + )m). then print "Pttern ours with shift" s. if s < n m. then t s+ (d(t s T[s + ]h) + T[s + m + ]) mod q

Finite utomton

Finite utomton A finite utomton M is -tuple (Q, q, A,, δ ) Q is finite set of sttes, q Q is the strt sttes, A Q is distinguished set of epting sttes, is finite input lphet, δ is funtion from Q into Q, lled the trnsition funtion of M.

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

Opertion of finite utomton P Stte 9 8 i T[i] δ

String mthing with finite utomt FINITE-AUTOMATON-MATCHER(T, δ, m). n length[t]. q. for i to n. do q δ(q, T[i]). if q = m. then print "Pttern ours with shift" i m Running time is Θ(n)

Computing the trnsition funtion COMPUTE-TRANSITION-FUNCTION(P, ). m length[p]. for q to m. do for eh hrter. do k min(m +, q + ). repet k k. until Pk P q. δ(q, ) k 8. return δ Running time is O(m )

Computing the trnsition funtion Step 8 9 m... q...... k... P k ε ε ε... P q δ δ(, ) = δ(, ) = δ(, ) = δ(, ) = δ(, ) = δ(, ) =...

Ide of Knuth-Morris-Prtt lgorithm T P s q T P s' = s + k

Ide of Knuth-Morris-Prtt lgorithm i 8 9 P[i] π[i] P 8 P π[8] = P π[] = P π[] = P π[] = ε

Knuth-Morris-Prtt lgorithm KMP-MATCHER(T, P). n length[t]. m length[p]. π COMPUTE-PREFIX-FUNCTION(P). q. for i to n. do while q > nd P[q + ] T[i]. do q π[q] 8. if P[q + ] = T[i] 9. then q q +. if q = m. then print "Pttern ours with shift" i m. q π[q] Running time is Θ(n)

Computing prefix funtion COMPUTE-PREFIX-FUNCTION(P). m length[p]. π[]. k. for q to m. do while k > nd P[k + ] P[q]. do k π[k]. if P[k + ] = P[q] 8. then k k + 9. π[q] k. return π Running time is Θ(m)

Computing prefix funtion Step m q k P[k +] = P[q] π π() = P[] = = P[] π() = P[] = = = P[] π() = P[] = = = P[] π() = P[] = = = P[] 8 π() = 9 P[] = = = P[] π() = P[] = = = P[] π() =

Computing prefix funtion (ont.) Step m q k P[k +] = P[q] π 8 P[] = = = P[8] π(8) = 9 P[] = = P[9] P[] = = P[9] P[] = = P[9] 8 π(9) =

String mthing lgorithms Algorithms Nive Rin-Krp Finite utomton Knuth-Morris-Prtt Preproessing time Θ(m) O(m ) Θ(m) Mthing time O((n m +)m) O((n m +)m) Θ(n) Θ(n)

Any question? Xioqing Zheng Fundn University