Dt Strutures nd Algorithm Xioqing Zheng zhengxq@fudn.edu.n
String mthing prolem Pttern P ours with shift s in text T (or, equivlently, tht pttern P ours eginning t position s + in text T) if T[s +... s + m] = P[... m] nd s n m. If P ours with shift s in T, the we ll s vlid shift. The string-mthing prolem is the prolem of finding ll vlid shifts with whih given pttern P ours in given text T. Text T Pttern P s =
Nottion nd terminology Σ * Σ ε x xy w w P k x x finite lphet. = {,,..., z}. set of ll finite-length strings formed using hrters from the lphet. zero-length empty string elongs to *. length of string x. ontention of two string x nd y. w is prefix of string x, if x = wy for some string y *. w is suffix of string x, if x = yw for some string y *. k-hrter prefix P[... k] of the pttern P.
Nive string-mthing lgorithm NAIVE-STRING-MATCHER(T, P ). n length[t]. m length[p]. for s to n m. do if P[... m] = T[s +... s + m]. then print "Pttern ours with shift" s Running time is O((n m + )m)
Ide of Rin-Krp lgorithm Input hrters nd string n e represented y grphil symols or digits. Given pttern P[... m], let p denote its orresponding deiml vlue. p = P[m] + (P[m ]) + (P[m ] +... + (P[] + P[])...)) We lso let t s denote the deiml vlue of the length-m sustring T[s +... s + m], for s =,,..., n m. Then, t s = p if nd only if T[s +... s + m] = P[... m].
Ide of Rin-Krp lgorithm t s+ n e omputed from t s in onstnt time, sine, t s+ = (t s m T[s + ]) + T[s + m + ]. Constnt is preomputed whih n e done in time O(lgm) For exmple, if m = nd t s =, then we remove the high-order digit T[s + ] = nd ring in the new low-order digit T[s + + ] = to otin t s+ = ( ) + =.
Improved Rin-Krp lgorithm t s+ = ((t s T[s + ]h) + T[s + m + ]) mod q. q is typilly hosen s prime suh tht q just fits within one omputer word. h m (mod q).
Improved Rin-Krp lgorithm old highorder digit 8 new loworder digit ( ) + (mod) ( ) + (mod) 8(mod)
Improved Rin-Krp lgorithm 8 9 8 9 9 9...... 8 9 8 9 mod Vlid mth Spurious hit ts p(mod q) does not imply tht t s = p.
Rin-Krp lgorithm RABIN-KARP-MATCHER(T, P, d, q ). n length[t]. m length[p]. h d m mod q. p. t. for i to m // Preproessing.. do p (dp + P[i]) mod q 8. do t (dt + T[i]) mod q 9. for s to n m // Mthing.. do if p = t s. then if P[... m] = T[s +... s + m] Preproessing Ф(m) Mthing O((n m + )m). then print "Pttern ours with shift" s. if s < n m. then t s+ (d(t s T[s + ]h) + T[s + m + ]) mod q
Finite utomton
Finite utomton A finite utomton M is -tuple (Q, q, A,, δ ) Q is finite set of sttes, q Q is the strt sttes, A Q is distinguished set of epting sttes, is finite input lphet, δ is funtion from Q into Q, lled the trnsition funtion of M.
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
Opertion of finite utomton P Stte 9 8 i T[i] δ
String mthing with finite utomt FINITE-AUTOMATON-MATCHER(T, δ, m). n length[t]. q. for i to n. do q δ(q, T[i]). if q = m. then print "Pttern ours with shift" i m Running time is Θ(n)
Computing the trnsition funtion COMPUTE-TRANSITION-FUNCTION(P, ). m length[p]. for q to m. do for eh hrter. do k min(m +, q + ). repet k k. until Pk P q. δ(q, ) k 8. return δ Running time is O(m )
Computing the trnsition funtion Step 8 9 m... q...... k... P k ε ε ε... P q δ δ(, ) = δ(, ) = δ(, ) = δ(, ) = δ(, ) = δ(, ) =...
Ide of Knuth-Morris-Prtt lgorithm T P s q T P s' = s + k
Ide of Knuth-Morris-Prtt lgorithm i 8 9 P[i] π[i] P 8 P π[8] = P π[] = P π[] = P π[] = ε
Knuth-Morris-Prtt lgorithm KMP-MATCHER(T, P). n length[t]. m length[p]. π COMPUTE-PREFIX-FUNCTION(P). q. for i to n. do while q > nd P[q + ] T[i]. do q π[q] 8. if P[q + ] = T[i] 9. then q q +. if q = m. then print "Pttern ours with shift" i m. q π[q] Running time is Θ(n)
Computing prefix funtion COMPUTE-PREFIX-FUNCTION(P). m length[p]. π[]. k. for q to m. do while k > nd P[k + ] P[q]. do k π[k]. if P[k + ] = P[q] 8. then k k + 9. π[q] k. return π Running time is Θ(m)
Computing prefix funtion Step m q k P[k +] = P[q] π π() = P[] = = P[] π() = P[] = = = P[] π() = P[] = = = P[] π() = P[] = = = P[] 8 π() = 9 P[] = = = P[] π() = P[] = = = P[] π() =
Computing prefix funtion (ont.) Step m q k P[k +] = P[q] π 8 P[] = = = P[8] π(8) = 9 P[] = = P[9] P[] = = P[9] P[] = = P[9] 8 π(9) =
String mthing lgorithms Algorithms Nive Rin-Krp Finite utomton Knuth-Morris-Prtt Preproessing time Θ(m) O(m ) Θ(m) Mthing time O((n m +)m) O((n m +)m) Θ(n) Θ(n)
Any question? Xioqing Zheng Fundn University