String Matching. CSE 548: (Design and) Analysis of Algorithms. Topics. Terminology


String Matching
CSE 548: (Design and) Analysis of Algorithms
String Algorithms
R. Sekar

Sections: Intro, RE, FSA, To DFA, Trie, grep, agrep, Fing.print, Suffix trees

Motivation
  Strings provide the primary means of interfacing to machines: programs, documents, ...
  Consequently, string matching is central to numerous widely used systems and tools:
    Compilers and interpreters, command processors (e.g., sh), text-processing tools (sed, awk, ...)
    Document searching and processing, e.g., grep, Google, NLP tools, ...
    Editors and word processors
    File versioning and compression, e.g., rcs, svn, rsync, ...
    Network and system management, e.g., intrusion detection, performance monitoring, ...
    Computational biology, e.g., DNA alignment, mutations, evolutionary trees, ...

Topics
  1. Intro: Motivation, Background
  2. RE: Regular expressions
  3. FSA: DFA and NFA
  4. To DFA: RE Derivatives, McNaughton-Yamada
  5. Trie: Tries
  6. grep: Using Derivatives, KMP, Aho-Corasick, Shift-And
  7. agrep: Levenshtein Automaton
  8. Fing.print: Rabin-Karp, Rolling Hashes, Common Substring and rsync
  9. Suffix trees: Overview, Applications, Suffix Arrays

Terminology
  String: A list S[1..i] of characters over an alphabet Σ.
  Substring: A string P[1..j] such that P[1..j] = S[l+1..l+j] for some l.
  Prefix: A substring P of S occurring at its beginning.
  Suffix: A substring P of S occurring at its end.
  Subsequence: Similar to a substring, but the elements of P need not occur contiguously in S.
  For instance, bcd is a substring of abcde, while de is a suffix, abcd is a prefix, and acd is a subsequence.
  A substring (or prefix/suffix/subsequence) T of S is said to be proper if T ≠ S.
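
The terminology can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function names are illustrative, and strings are ordinary 0-indexed Python strings rather than the 1-indexed S[1..i] used on the slides.

```python
def is_prefix(p: str, s: str) -> bool:
    # p occurs at the beginning of s
    return s[:len(p)] == p

def is_suffix(p: str, s: str) -> bool:
    # p occurs at the end of s
    return len(p) <= len(s) and s[len(s) - len(p):] == p

def is_substring(p: str, s: str) -> bool:
    # p = s[l .. l+len(p)-1] for some offset l
    return any(s[l:l + len(p)] == p for l in range(len(s) - len(p) + 1))

def is_subsequence(p: str, s: str) -> bool:
    # characters of p appear in s in order, not necessarily contiguously
    it = iter(s)
    return all(ch in it for ch in p)

def is_proper(t: str, s: str) -> bool:
    # a substring/prefix/suffix/subsequence t of s is proper if t != s
    return t != s
```

For the string abcde, `is_substring("bcd", "abcde")` and `is_subsequence("acd", "abcde")` both hold, while `is_substring("acd", "abcde")` does not.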

String Matching Problems
  Given a pattern string p and another string s:
    Exact match: Is p a substring of s?
    Match with wildcards: The pattern can contain wildcard characters that can match any character in s.
    Regular expression match: Here p is a regular expression.
    Substring/prefix/suffix: Does a (sufficiently long) substring/prefix/suffix of p occur in s?
    Approximate match: Is there a substring of s that is within a certain edit distance from p?
    Multi-match: Instead of a single pattern, you are given a set p1, ..., pn of patterns. Applies to all of the above problems.

String Matching Techniques
  Finite automata and variants: Regexp matching, Knuth-Morris-Pratt, Aho-Corasick
  Seminumerical techniques: Shift-and, Shift-and with errors, Rabin-Karp, hash-based
  Suffix trees and suffix arrays: Techniques for finding substrings, suffixes, etc.

Language of Regular Expressions
  Notation to represent (potentially) infinite sets of strings over an alphabet Σ.
  a: stands for the set of strings {a}
  a | b: stands for the set {a, b}. Union of the sets corresponding to the REs a and b.
  ab: stands for the set {ab}. Analogous to set product on the REs for a and b.
  (a|b)(a|b): stands for the set {aa, ab, ba, bb}.
  a*: stands for the set {ε, a, aa, aaa, ...} that contains all strings of zero or more a's. Analogous to closure of the product operation.

Regular Expression
  Let R be the set of all regular expressions over Σ. Then,
    Empty string: ε ∈ R
    Unit strings: α ∈ Σ ⇒ α ∈ R
    Concatenation: r1, r2 ∈ R ⇒ r1 r2 ∈ R
    Alternative: r1, r2 ∈ R ⇒ (r1 | r2) ∈ R
    Kleene closure: r ∈ R ⇒ r* ∈ R

Regular Expression Examples
  (a|b)*: Set of strings with zero or more a's and zero or more b's:
    {ε, a, b, aa, ab, ba, bb, aaa, aab, ...}
  (a*b*): Set of strings with zero or more a's and zero or more b's such that all a's occur before any b:
    {ε, a, b, aa, ab, bb, aaa, aab, abb, ...}
  (a*b*)*: Set of strings with zero or more a's and zero or more b's:
    {ε, a, b, aa, ab, ba, bb, aaa, aab, ...}

Semantics of Regular Expressions
  Semantic function L: Maps regular expressions to sets of strings.
    L(ε) = {ε}
    L(α) = {α}   (α ∈ Σ)
    L(r1 | r2) = L(r1) ∪ L(r2)
    L(r1 r2) = L(r1) · L(r2)
    L(r*) = {ε} ∪ (L(r) · L(r*))

Finite State Automata
  Regular expressions are used for specification, while FSA are used for computation.
  FSAs are represented by a labeled directed graph:
    A finite set of states (vertices).
    Transitions between states (edges).
    Labels on transitions are drawn from Σ ∪ {ε}.
    One distinguished start state.
    One or more distinguished final states.

Finite State Automata: An Example
  Consider the regular expression (a|b)*a(a|b).
  L((a|b)*a(a|b)) = {aa, ab, aaa, aab, baa, bab, aaaa, aaab, abaa, abab, baaa, ...}.
  The following (non-deterministic) automaton determines whether an input string belongs to L((a|b)*a(a|b)):
  [Figure: NFA diagram, not reproduced in the transcription]

Determinism
  (a|b)*a(a|b):
  Nondeterministic (NFA): [Figure not reproduced]
  Deterministic (DFA): [Figure not reproduced]

Acceptance Criterion
  A finite state automaton (NFA or DFA) accepts an input string x
    ... if beginning from the start state
    ... we can trace some path through the automaton
    ... such that the sequence of edge labels spells x
    ... and end in a final state.
  Or, there exists a path in the graph from the start state to a final state such that the sequence of labels on the path spells out x.

Recognition with an NFA
  Is the input in L((a|b)*a(a|b))?
  [Figure not reproduced: an input string traced along three candidate paths (Path 1, Path 2, Path 3) through the NFA; the state sequences did not survive extraction.]

Recognition with a DFA
  Is the input in L((a|b)*a(a|b))?
  [Figure not reproduced: the same input traced along the single path through the DFA, ending in Accept.]
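
The acceptance criterion can be sketched in code by tracking the set of all states reachable on the input read so far. The three-state NFA below is an assumed automaton for (a|b)*a(a|b), since the slide's diagram is not reproduced: state 1 loops on a and b, an a moves from 1 to 2, and any symbol moves from 2 to the final state 3. It has no ε-transitions.

```python
# Hypothetical NFA for (a|b)*a(a|b); the state numbering is an assumption.
NFA = {
    (1, "a"): {1, 2},
    (1, "b"): {1},
    (2, "a"): {3},
    (2, "b"): {3},
}
START, FINAL = 1, {3}

def nfa_accepts(x: str) -> bool:
    """True if some path labeled x leads from START to a final state."""
    current = {START}
    for ch in x:
        # follow every transition on ch out of every state we might be in
        current = set().union(*(NFA.get((q, ch), set()) for q in current))
    return bool(current & FINAL)
```

Tracking a set of states is what gives NFA matching its O(n·m) cost: each of the m input symbols may touch up to n states.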

NFA vs. DFA
  For every NFA, there is a DFA that accepts the same set of strings.
  An NFA may have transitions labeled by ε (spontaneous transitions).
  All transition labels in a DFA belong to Σ.
  For some string x, there may be many accepting paths in an NFA.
  For all strings x, there is one unique accepting path in a DFA.
  Usually, an input string can be recognized faster with a DFA.
  NFAs are typically smaller than the corresponding DFAs.

NFA vs. DFA
  n = size of regular expression (pattern)
  m = length of input string (subject)

         Size of automaton    Recognition time per input string
  NFA    O(n)                 O(n × m)
  DFA    O(2^n)               O(m)

Converting RE to FSA
  NFA: Compile RE to NFA (Thompson's construction [1968]), then match.
  DFA: Compile to DFA, then match
    (A) Convert NFA to DFA (Rabin-Scott construction), minimize
    (B) Direct construction: RE derivatives [Brzozowski 1964].
        More convenient and a bit more general than (A).
    (C) Direct construction of [McNaughton Yamada 1960]
        Can be seen as a (more easily implemented) specialization of (B).
        Used in Lex and its derivatives, i.e., most compilers use this algorithm.

Converting RE to FSA
  The NFA approach takes O(n) NFA construction plus O(nm) matching, so has worst-case O(nm) complexity.
  The DFA approach takes O(2^n) construction plus O(m) matching, so has worst-case O(2^n + m) complexity.
  So, why bother with DFA?
    In many practical applications, the pattern is fixed and small, while the subject text is very large. So, the O(mn) term is dominant over O(2^n).
    For many important cases, DFAs are of polynomial size.
    In many applications, exponential blow-ups don't occur, e.g., compilers.

Derivative of Regular Expressions
  The derivative of a regular expression R w.r.t. a symbol x, denoted ∂x[R], is another regular expression R' such that L(R') = { w | xw ∈ L(R) }, so that L(xR') ⊆ L(R).
  Basically, ∂x[R] captures the suffixes of those strings that match R and start with x.
  Examples
    ∂a[a(b|c)] = b|c
    ∂a[(a|b)cd] = cd
    ∂a[(a|b)*cd] = (a|b)*cd
    ∂c[(a|b)*cd] = d
    ∂d[(a|b)*cd] = ∅

Definition of RE Derivative (1)
  inclEps(R): A predicate that returns true if ε ∈ L(R)
    inclEps(a) = false, for every a ∈ Σ
    inclEps(R1 | R2) = inclEps(R1) ∨ inclEps(R2)
    inclEps(R1 R2) = inclEps(R1) ∧ inclEps(R2)
    inclEps(R*) = true
  Note: inclEps can be computed in linear time.

Definition of RE Derivative (2)
    ∂a[a] = ε
    ∂a[b] = ∅   (for b ≠ a)
    ∂a[R1 | R2] = ∂a[R1] | ∂a[R2]
    ∂a[R*] = ∂a[R] R*
    ∂a[R1 R2] = ∂a[R1] R2 | ∂a[R2]   if inclEps(R1)
              = ∂a[R1] R2             otherwise
  Note: L(ε) = {ε} ≠ L(∅) = {}

DFA Using Derivatives: Illustration
  Consider R1 = (a|b)*a(a|b)
    ∂a[R1] = R1 | (a|b) = R2
    ∂b[R1] = R1
    ∂a[R2] = R1 | (a|b) | ε = R3
    ∂b[R2] = R1 | ε = R4
    ∂a[R3] = R1 | (a|b) | ε = R3
    ∂b[R3] = R1 | ε = R4
    ∂a[R4] = R1 | (a|b) = R2
    ∂b[R4] = R1
  [Figure: the resulting four-state DFA, not reproduced]
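
The derivative rules translate almost line for line into code. The sketch below is an illustration under stated assumptions, not the slides' implementation: regular expressions are encoded as nested tuples (a representation chosen here), and the smart constructors `alt` and `cat` apply a few simplifications (∅ | R = R, εR = R, ∅R = ∅) so that results stay small.

```python
EMPTY = ("empty",)   # the RE ∅, matching no string
EPS = ("eps",)       # the RE ε, matching only the empty string

def sym(a):
    return ("sym", a)

def alt(r1, r2):
    if r1 == EMPTY:
        return r2
    if r2 == EMPTY or r1 == r2:
        return r1
    return ("alt", r1, r2)

def cat(r1, r2):
    if EMPTY in (r1, r2):
        return EMPTY
    if r1 == EPS:
        return r2
    if r2 == EPS:
        return r1
    return ("cat", r1, r2)

def star(r):
    return ("star", r)

def incl_eps(r):
    """True iff the empty string is in L(r)."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "alt":
        return incl_eps(r[1]) or incl_eps(r[2])
    if tag == "cat":
        return incl_eps(r[1]) and incl_eps(r[2])
    return False  # ∅ or a single symbol

def deriv(x, r):
    """Derivative of r with respect to the symbol x."""
    tag = r[0]
    if tag == "sym":
        return EPS if r[1] == x else EMPTY
    if tag == "alt":
        return alt(deriv(x, r[1]), deriv(x, r[2]))
    if tag == "star":
        return cat(deriv(x, r[1]), r)
    if tag == "cat":
        first = cat(deriv(x, r[1]), r[2])
        return alt(first, deriv(x, r[2])) if incl_eps(r[1]) else first
    return EMPTY  # derivative of ε or ∅

def matches(r, s):
    """Match by taking one derivative per input symbol."""
    for ch in s:
        r = deriv(ch, r)
    return incl_eps(r)
```

Building a DFA amounts to memoizing the distinct derivatives reached this way; how aggressively equal derivatives are identified determines how many states result.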

McNaughton-Yamada Construction
  Can be viewed as a simpler way to represent derivatives.
  Positions in the RE are numbered, e.g., 0 (a_1 | b_2)* a_3 (a_4 | b_5) $_6, where a subscript is the position of a symbol and 0 is the start position.
  A derivative is identified by its beginning position in the RE.
    Or more generally, a derivative is identified by a set of positions.
  Each DFA state corresponds to a position set (pset):
    R1 ≡ {1, 2, 3}
    R2 ≡ {1, 2, 3, 4, 5}
    R3 ≡ {1, 2, 3, 4, 5, 6}
    R4 ≡ {1, 2, 3, 6}
  [Figure: the four-state DFA, not reproduced]

McNaughton-Yamada: Definitions
  first(P): Yields the set of first symbols of the RE denoted by pset P.
    Determines the transitions out of the DFA state for P.
    Example: For the RE (a_1 | b_2)* a_3 (a_4 | b_5) $_6, first({1, 2, 3}) = {a, b}
  P|s: Subset of P whose positions contain s, i.e., {p ∈ P | R contains s at p}
    Example: {1, 2, 3}|a = {1, 3}, {1, 2, 4, 5}|b = {2, 5}
  follow(P): Yields the set of positions that immediately follow P.
    Note: follow(P) = ∪_{p ∈ P} follow({p})
    Definition is very similar to derivatives.
    Example: follow({3, 4}) = {4, 5, 6}, follow({1}) = {1, 2, 3}

McNaughton-Yamada Construction (2)
  BuildMY(R, pset)
    Create an automaton state S labeled pset
    Mark this state as final if $ occurs in R at pset
    foreach symbol x ∈ first(pset) − {$} do
      Call BuildMY(R, follow(pset|x)) if it hasn't previously been called
      Create a transition on x from S to the root of this subautomaton
  DFA construction begins with the call BuildMY(R, follow({0})). The root of the resulting automaton is marked as the start state.

BuildMY Illustration on R = 0 (a_1 | b_2)* a_3 (a_4 | b_5) $_6
  Computations needed
    follow({0}) = {1, 2, 3}
    follow({1}) = follow({2}) = {1, 2, 3}
    follow({3}) = {4, 5}
    follow({4}) = follow({5}) = {6}
    {1, 2, 3}|a = {1, 3}, {1, 2, 3}|b = {2}
    follow({1, 3}) = {1, 2, 3, 4, 5}
    {1, 2, 3, 4, 5}|a = {1, 3, 4}, {1, 2, 3, 4, 5}|b = {2, 5}
    follow({1, 3, 4}) = {1, 2, 3, 4, 5, 6}
    follow({2, 5}) = {1, 2, 3, 6}
    {1, 2, 3, 4, 5, 6}|a = {1, 3, 4}, {1, 2, 3, 4, 5, 6}|b = {2, 5}
    {1, 2, 3, 6}|a = {1, 3}, {1, 2, 3, 6}|b = {2}
  Resulting automaton
    State   Pset
    1       {1, 2, 3}
    2       {1, 2, 3, 4, 5}
    3       {1, 2, 3, 4, 5, 6}
    4       {1, 2, 3, 6}
  [Figure: transition diagram of the resulting DFA, not reproduced]
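
The construction can be replayed on the example RE (a|b)*a(a|b)$ with positions 1 to 6. The sketch below makes two simplifying assumptions: the per-position follow sets are entered by hand from the slide rather than computed from the RE, and a worklist replaces the recursive calls of BuildMY. The names `SYMBOL_AT`, `FOLLOW`, `build_my` and `accepts` are illustrative.

```python
# Symbol at each position of 0 (a_1 | b_2)* a_3 (a_4 | b_5) $_6
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "a", 5: "b", 6: "$"}
# follow({p}) for each single position, copied from the slide
FOLLOW = {0: {1, 2, 3}, 1: {1, 2, 3}, 2: {1, 2, 3},
          3: {4, 5}, 4: {6}, 5: {6}, 6: set()}

def follow(pset):
    """follow(P) is the union of follow({p}) over p in P."""
    out = set()
    for p in pset:
        out |= FOLLOW[p]
    return frozenset(out)

def build_my():
    """Return (start pset, transitions, final psets) of the DFA."""
    start = follow({0})
    states = {}            # pset -> {symbol: successor pset}
    finals = set()
    worklist = [start]
    while worklist:
        pset = worklist.pop()
        if pset in states:
            continue       # BuildMY was already "called" on this pset
        states[pset] = {}
        if any(SYMBOL_AT[p] == "$" for p in pset):
            finals.add(pset)
        for x in sorted({SYMBOL_AT[p] for p in pset} - {"$"}):
            # pset|x: the positions in pset that carry symbol x
            target = follow({p for p in pset if SYMBOL_AT[p] == x})
            states[pset][x] = target
            worklist.append(target)
    return start, states, finals

def accepts(x):
    start, states, finals = build_my()
    state = start
    for ch in x:
        if ch not in states[state]:
            return False
        state = states[state][ch]
    return state in finals
```

On this input the worklist discovers exactly the four psets listed in the table above.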

McNaughton-Yamada (MY) vs Derivatives
  Conceptually very similar.
  MY takes a bit longer to describe, and its correctness is a bit harder to follow.
  MY is also more mechanical, and hence is found in most implementations.
  The derivatives approach is more general:
    Can support some extensions to REs, e.g., the complement operator.
    Can avoid some redundant states during construction.
      Example: For ac | bc, the DFA built by the derivative approach has 3 states, but the one built by the MY construction has 4 states.
      The derivative approach merges the two c's in the RE, but with MY, the two c's have different positions, and hence operations on them are not shared.

Avoiding Redundant States
  Automata built by MY are not optimal.
    Automata minimization algorithms can be used to produce an optimal automaton.
  The derivatives approach associates DFA states with derivatives, but does not say how to determine equality among derivatives.
  There is a spectrum of techniques to determine RE equality:
    MY is the simplest: it relies on syntactic identity.
    At the other end of the spectrum, we could use a complete decision procedure for RE equality.
      In this case, the derivative approach yields the optimal DFA!
    In practice we would tend to use something in the middle:
      Trade off some power for ease/efficiency of implementation.

RE to DFA conversion: Complexity
  Given that DFA size can be exponential in the worst case, we obviously must accept worst-case exponential complexity.
  For the derivatives approach, it is not immediately obvious that it even terminates!
  More obvious for the McNaughton-Yamada approach, since DFA states correspond to position sets, of which there are only 2^n.
  Derivative computation is linear in RE size in the general case.
  So, overall complexity is O(n 2^n).
  Complexity can be improved, but the worst-case 2^n takes away some of the rationale for doing so.
  Instead, we focus on improving performance in many frequently occurring special cases where better complexity is achievable.

RE Matching: Summary
  Regular expression matching is much more powerful than matching on plain strings (e.g., prefix, suffix, substring, etc.).
  It is natural that RE matching algorithms can be used to solve plain string matching.
  But usually, you pay for increased power: more complex algorithms, larger runtimes or storage.
  We study the RE approach because it seems to not only do RE matching, but also yield simpler, more efficient algorithms for matching plain strings.

String Lookup
  Problem: Determine if s equals any of the strings p1, ..., pk.
  Equivalent to the question: does the RE p1 | p2 | ... | pk match s?
  We can use the derivative approach, except that derivatives are very easy to compute.
  Or, we can use BuildMY: once again, follow() sets are very easy to compute for this class of regular expressions.
  Results in an FSA that is a tree.
    More commonly known as a trie.

Trie Example
  R0 = top | tool | tooth | at | sunk | sunny
  R1 = ∂t[R0] = op | ool | ooth
  R2 = ∂o[R1] = p | ol | oth
  R3 = ∂p[R2] = ε
  R4 = ∂o[R2] = l | th
  R5 = ∂l[R4] = ε
  R6 = ∂t[R4] = h, R7 = ∂h[R6] = ε
  R8 = ∂a[R0] = t, R9 = ∂t[R8] = ε
  R10 = ∂s[R0] = unk | unny
  R11 = ∂u[R10] = nk | nny
  R12 = ∂n[R11] = k | ny
  R13 = ∂k[R12] = ε
  R14 = ∂n[R12] = y, R15 = ∂y[R14] = ε
  [Figure: the trie with states 0 to 15, not reproduced]

Trie Summary
  A data structure for efficient lookup
    Construction time linear in the size of the keywords
    Search time linear in the size of the input string
  Can also support maximal common prefix (MCP) queries
  Can also be used for efficient representation of string sets
    Takes O(|s|) time to check if s belongs to the set
    Set union/intersection are linear in the size of the smaller set
      Sublinear in input size when one input trie is much larger than the other
    Can compute set difference as well with the same complexity.

Implementing Transitions
  How to implement transitions?
    Array: Efficient, but unacceptable space when |Σ| is large
    Linked list: Space-efficient, but slow
    Hash tables: Mid-way between the above two options, but noticeably slower than arrays. Collisions are a concern.
      But customized hash tables for this purpose can be developed.
      Alternatively, since transition tables are static, we can look for perfect hash functions.
    Specialized representations: For special cases such as exact search, we could develop specialized alternatives that are more efficient than all of the above.
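
As an illustration of the trie and of the hash-table option for transitions, here is a minimal sketch that stores each state's outgoing transitions in a Python dict. The class and method names are assumptions made for this example, not an API from the slides.

```python
class Trie:
    """Each node is a state; `children` maps a symbol to the next state."""

    def __init__(self, words=()):
        self.children = {}      # hash-table representation of transitions
        self.is_final = False   # True if a keyword ends at this state
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_final = True

    def contains(self, s):
        """O(|s|) membership test."""
        node = self
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_final

    def max_common_prefix(self, s):
        """Longest prefix of s that is also a prefix of some keyword."""
        node, n = self, 0
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                break
            n += 1
        return s[:n]
```

With the keywords top, tool, tooth, at, sunk and sunny, `contains("tool")` holds, `contains("too")` does not (no keyword ends there), and `max_common_prefix("sunset")` yields `"sun"`.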

Exact Search
  Determine if a pattern P[1..n] occurs within text S[1..m]
    Find j such that P[1..n] = S[j..(j+n-1)]
  An RE matching problem: Does Σ*PΣ* match S?
    Note: Σ* matches any arbitrary string (incl. ε)
  We consider Σ*P since it can identify all matches
    A match can be reported each time a final state is reached.
    In contrast, an automaton for Σ*PΣ* may not report all matches

Exact Search Example
  Consider R0 = (Σ_0)* P $_8, where P is a 7-symbol pattern occupying positions 1 to 7. [The pattern's symbols are not recoverable from the transcription.]
  We use McNaughton-Yamada. Recall that with this technique:
    States are identified by position sets.
    A position denotes a derivative starting at that position.
    A position set indicates the union of the REs corresponding to each position.
    For instance, position set {0, 2, 3} represents the union of R0 with the parts of the RE that start at positions 2 and 3.
  [Figure: the resulting DFA, not reproduced]

Exact Search: Complexity
  Positives:
    Matching is very fast, taking only O(m) time.
    Only a linear (rather than exponential) number of states
  Downsides:
    Construction of psets for each state takes up to O(n) time
    Thus, overall complexity of automata construction is O(n^2)
      Can be O(n^2 |Σ|) since each state may have up to |Σ| transitions
  Question: Can we do better?
    Faster construction: O(n) instead of O(n^2)?
    More efficient representation for transitions: a constant number of transitions per state?

Improving Exact Search: Observations
  The DFA has a linear structure, with states 1 to n+1:
    State i is reached on matching the prefix P[1..i-1]
    The largest element of pset(i) is i
  If you are in state i after scanning S[k]:
    Let P' = P[1..i-1] = S[k-i+2..k]
    Unwinding of Σ*: A prefix of S[k-i+2..k] can be matched with Σ*, with the rest matching P[1..j-1]
    So, pset(i) includes every j such that P[1..j-1] = P[i-j+1..i-1], i.e., the prefix P[1..j-1] is also a suffix of S[k-i+2..k]
  [Figure: the pattern slid along S, with each alignment that is still a viable match marked; not reproduced]

Improving Exact Search: Key Ideas
  Main idea
    Remember only the largest j < i in pset(i)
    You can look at pset(j) for the next smaller element
    Add failure links from state i to j for this purpose
    Two positions per pset ⇒ O(n) construction time
  [Figure: the linear automaton with failure links, not reproduced]

Exact Search: KMP Automaton
  Only two positions per state: {j, i}
  Two transitions per state: forward and fail
  If the symbol at both positions is the same, then the next state has the pset {j+1, i+1}
  Otherwise, the match at j cannot advance on the symbol at i. So, we use the fail link to identify the next shorter prefix that can advance:
    Follow the fail link to state u with pset {k, j} and see if that match can advance
    Otherwise, follow the fail link from u, and so on.
  The failure link chase is amortized O(1) time, while the other steps are O(1) time.

KMP Algorithm
  BuildAuto(P[1..m])
    j = 0
    for i = 1 to m do
      fail[i] = j
      while j > 0 and P[i] ≠ P[j] do
        j = fail[j]
      j++

  KMP(P[1..m], S[1..n])
    j = 1; BuildAuto(P)
    for i = 1 to n do
      while j > 0 and S[i] ≠ P[j] do
        j = fail[j]
      j++
      if j > m then return i - m + 1

  (In KMP, j starts at 1 so that S[1] is compared against P[1]; state j means that P[1..j-1] has been matched.)
  Simple, avoids explicit representation of states/transitions.
    Each state has two transitions: normal and failure.
    The normal transition at state i is on P[i]
    Fail links are stored in an array fail
  BuildAuto is like matching the pattern with itself!
  The algorithm is unbelievably short and simple!

Multi-pattern Exact Search
  Can we extend KMP to support multiple patterns?
  Yes we can! It is called the Aho-Corasick (AC) automaton
    Note that the AC algorithm was published before KMP!
    Today, many systems use AC (e.g., grep, snort), but not KMP.
  KMP looks like a linear automaton plus failure links.
  Aho-Corasick looks like a trie extended with failure links.
    Failure links may go to a non-ancestor state
    Failure link computations are similar
  McNaughton-Yamada and the derivatives algorithms build an automaton similar to AC, just as they did for KMP.
    One can understand Aho-Corasick as a specialization of these algorithms, as we did in the case of KMP,
    or as a generalization of KMP.
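
The KMP pseudocode from the earlier slide can be transcribed into runnable form. The sketch below keeps the slides' 1-based indexing by padding both strings with a dummy character at index 0 (an implementation convenience assumed here), and starts the search with j = 1 so that the first text symbol is compared against P[1]. It returns the 1-based position of the first match, or None.

```python
def build_auto(P):
    """fail[i] = state to fall back to when P[i] fails to match in state i."""
    m = len(P)
    p = " " + P                  # pad so that p[1..m] is the pattern
    fail = [0] * (m + 1)
    j = 0
    for i in range(1, m + 1):
        fail[i] = j
        while j > 0 and p[i] != p[j]:
            j = fail[j]
        j += 1
    return fail

def kmp(P, S):
    """1-based start of the first occurrence of P in S, or None."""
    m, n = len(P), len(S)
    if m == 0:
        return 1                 # the empty pattern matches at the start
    p, s = " " + P, " " + S
    fail = build_auto(P)
    j = 1                        # state j: P[1..j-1] has been matched
    for i in range(1, n + 1):
        while j > 0 and s[i] != p[j]:
            j = fail[j]          # chase failure links
        j += 1
        if j > m:
            return i - m + 1
    return None
```

Each iteration of the outer loop increments j once and every failure-link step decreases it, which is where the amortized O(1) cost per text symbol comes from.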

Aho-Corasick Automaton
  As with KMP, we can think of AC as a specialization of MY. Consider an RE of the form (Σ)*(p1 | ... | pk) for the patterns p1, ..., pk.
    Retain just the largest two numbers i and j in the pset.
    Use the value of j as the target for the failure link, and to find j' in the successor state's pset {j', i+1}
  But there is an extra wrinkle:
    With KMP, there is one pattern; we keep two positions from it.
    With AC, we have multiple patterns, so a state's pset will contain positions from multiple patterns.
    If two patterns share a prefix, the automaton state reached by this prefix will contain the next positions from both patterns.
    We will simply retain one of these positions, say, from the higher numbered pattern.
    To avoid clutter in our example, we omit numbering of positions that will be dropped this way.

Aho-Corasick Example
  (Σ_0)* ( t_1 o_2 p_3 $_4 | t o o_5 l_6 $_7 | t o o t_8 h_9 $_a | p_b e_c n_d $_e | o_f p_g e_h n_i $_j | o o_k z_l e_m $_n )
  (Subscripts are position labels; labels continue with a, b, c, ... after 9.)
  To reduce clutter, positions that occur with previously numbered positions are not explicitly numbered, e.g., the o's in tooth (they occur with the o's in tool).
  The figure omits failure links that go to the start state.
  [Figure: the Aho-Corasick automaton for the patterns top, tool, tooth, pen, open, ooze; not reproduced]

Alternative Approaches for Exact Search
  The DFA approach had significant preprocessing ("compiling") costs, but optimized runtime: exactly m comparisons.
  KMP reduces compile time (while also simplifying the automaton structure) by shifting more work (up to 2m comparisons) to runtime.
    DFA states contain information about all matching prefixes, but KMP states retain just the two longest ones.
    Other prefixes are essentially being computed at runtime by following fail links.
  Can we remember even less in automaton states?
    Can we leave all matching prefixes to be computed at runtime?

Parallel Exact Search
  Key idea: Maintain all matching prefixes at runtime.
    Simultaneously advance the state of all these prefixes after reading the next input character.
    So, there will only be O(m) comparisons in total.
  How can we do this?
    Think of the KMP automaton and strip off all failure links
    The automaton is now linear: each state has a single successor
    On the next input symbol, the prefix will either be extended by transitioning to the successor state, or the symbol doesn't match, and this match is aborted

Bit-parallel Exact Search: Shift-And Method
  [Figure: the linear pattern automaton with tokens M_k, M_k-1, ..., M_k-7 on its states; not reproduced]
  A search for a match begins on each symbol S[k] in the input
    Think of placing a token in the start state each time you slide P over S
    If symbols continue to match, tokens advance through successive states until they reach the final state.
    If there is a mismatch, the corresponding token disappears or "dies"
    At any time, tokens for matches beginning at S[k-n] through S[k] will be in the automaton.

Bit-parallel Exact Search: Shift-And Method
  Use a bitvector T[0..n] to record these tokens.
    T[j] indicates if the token for the match beginning at S[k-j] is still alive, i.e., S[k-j..k-1] = P[1..j]
    T[n] = 1 indicates a completed match.

Bit-parallel Exact Search: Shift-And Method
  [Figure: states T0, T1, ..., T7 of the linear automaton; not reproduced]
  Each time k is incremented, advance tokens forward by one state
    T[j] advances if S[k] matches the transition label
  Use a bitvector δ[0..n] to record transitions.
    δ_x[j] = 1 if the transition out of state j is labeled x
    In other words, δ_x[j] = 1 iff P[j+1] = x
    [The slide lists the bitvectors δ_a and δ_b for the example pattern; their values are not recoverable from the transcription.]
    (Note that bitvector indices go from right to left, while string indices go left to right.)
  This means that when k is incremented, T should be updated as:
    T = [(T & δ_S[k]) << 1] | 1

Shift-And Method Illustration
  [Figure for k = 11: the text S, the pattern P, the token vector T, the transition vector δ for S[k], and the updated vector T_new; the bit values are not recoverable from the transcription.]
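
The update rule T = [(T & δ_S[k]) << 1] | 1 is short enough to run directly. The sketch below uses Python's arbitrary-precision integers as bitvectors; bit j of `T` plays the role of T[j], and `delta` is built so that bit j of `delta[x]` is set iff P[j+1] = x. The function name and the choice to return 1-based start positions are assumptions of this example.

```python
def shift_and(P, S):
    """Return the 1-based start positions of all occurrences of P in S."""
    n = len(P)
    delta = {}
    for j, x in enumerate(P):
        # enumerate is 0-based, so P[j] here is P[j+1] in slide notation
        delta[x] = delta.get(x, 0) | (1 << j)
    T = 1                        # only the start-state token is alive
    hits = []
    for k, ch in enumerate(S, start=1):
        # advance every surviving token, then add a fresh start token
        T = ((T & delta.get(ch, 0)) << 1) | 1
        if T & (1 << n):         # T[n] = 1: a match ends at S[k]
            hits.append(k - n + 1)
    return hits
```

For example, `shift_and("aba", "ababa")` reports the overlapping matches at positions 1 and 3. When the pattern fits in a machine word, each text symbol costs a constant number of word operations.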

Shift-And Method Illustration (continued)
  [Figures for k = 12, 13 and 14: for each step, the text S, the pattern P, the token vector T, the transition vector δ for S[k], and the updated vector T_new; the bit values are not recoverable from the transcription.]

Approximate Search
  Approach 1: Use an edit-distance algorithm
    Expensive
    Does not allow for multiple patterns
      Unless you try the patterns one by one
  Approach 2: Levenshtein automaton
    Can be much faster, especially when p is small.
    Supports multiple patterns
    Enables applications such as spell-correction

Levenshtein Automaton
  [Figure: a single row of states T0, T1, ..., T7; not reproduced]
  No errors permitted.

Levenshtein Automaton
  [Figure: two rows of states, U0 ... U7 above T0 ... T7; not reproduced]
  Up to one missing character (deletion).

Levenshtein Automaton
  [Figure: two rows of states, U0 ... U7 and T0 ... T7; not reproduced]
  Up to one deletion or one insertion.

Levenshtein Automaton
  [Figure: two rows of states, U0 ... U7 and T0 ... T7; not reproduced]
  Up to one deletion, or insertion, or substitution.

Levenshtein Automaton
  [Figure: three rows of states, V0 ... V7, U0 ... U7 and T0 ... T7; not reproduced]
  Up to a total of two deletions, insertions, or substitutions.

Levenshtein Automaton
  [Figure: the same three-row automaton; not reproduced]
  Compare with: the structure of the cost matrix for the edit-distance problem
    Finding least-cost paths from T0 to T7, U7 or V7
    Illustrates the relationship between the shortest path and edit-distance problems

Matching Using Levenshtein Automaton
  Convert to DFA (subset construction)
    Potentially O(n^k) states, where k is the maximum edit distance permitted
  Adapt the Shift-and algorithm
    We already know how to maintain T[0..n]
    Need to extend it to compute U from T, V from U, and so on.

Matching Using Levenshtein Automaton
  We extend the notation to explicitly include the current position k in T. With this extension, our original equation for T becomes
    T^k = 1 | [(T^(k-1) & δ_S[k]) << 1]
  Extending this to the case of U, we have
    U^k =   T^(k-1)                        // move on insertion of S[k]
          | (T^(k-1) << 1)                 // move with substitution
          | (T^k << 1)                     // move with deletion
          | [(U^(k-1) & δ_S[k]) << 1]      // move on a matching symbol
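
The two-layer recurrence (T for exact prefixes, U for prefixes matched with at most one error) can be sketched as follows. This is an illustration under assumptions: the initial value of U (the start state plus one deletion of P[1]) and the masking of results to n+1 bits are choices made here, and the function reports the 1-based text positions at which an approximate match of the whole pattern ends.

```python
def shift_and_one_error(P, S):
    """End positions (1-based) of substrings of S within edit distance 1 of P."""
    n = len(P)
    delta = {}
    for j, x in enumerate(P):
        delta[x] = delta.get(x, 0) | (1 << j)
    mask = (1 << (n + 1)) - 1        # keep bits 0..n only
    T = 1                            # exact layer: start token
    U = (T | (T << 1)) & mask        # one-error layer: P[1] may be deleted
    ends = []
    for k, ch in enumerate(S, start=1):
        d = delta.get(ch, 0)
        T_new = ((T & d) << 1) | 1   # exact-match layer, as in Shift-And
        U = (T                       # insertion of S[k]
             | (T << 1)              # substitution of S[k] for P[j+1]
             | (T_new << 1)          # deletion of P[j+1]
             | ((U & d) << 1)        # matching symbol within the error layer
             ) & mask
        T = T_new
        if U & (1 << n):
            ends.append(k)
    return ends
```

For instance, with P = abc and S = xxabdxx, a match ends at position 4 (ab with c deleted) and at position 5 (abd with d substituted for c). A third layer V would be computed from U in the same way.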

Levenshtein automaton and spell-correction
  When a word w is misspelled, we want to find the closest matching word in the dictionary
    Or, list all matches within an edit distance of l
  Approach:
    Build a Levenshtein automaton for w with l+1 layers
    Run the dictionary trie through the automaton
    List all matches
  Alternatively, a DFA for the Levenshtein automaton could be built, and the trie run through this DFA.
    The DFA could be directly constructed as well, without going through an NFA and powerset construction.

Using arithmetic for exact matching
  Problem: Given strings P[1..n] and T[1..m], find occurrences of P in T in O(n+m) time.
  Idea: To simplify presentation, assume P, T range over [0-9]
    Interpret P[1..n] as the digits of a number
      p = 10^(n-1) P[1] + 10^(n-2) P[2] + ... + 10^(n-n) P[n]
    Similarly, interpret T[i..(i+n-1)] as the number t_i
    Note: P is a substring of T at i iff p = t_i
    To get t_(i+1), shift T[i] out of t_i, and shift in T[i+n]:
      t_(i+1) = (t_i - 10^(n-1) T[i]) · 10 + T[i+n]
  We have an O(n+m) algorithm. Almost: we still need to figure out how to operate on n-digit numbers in constant time!

Rabin-Karp Fingerprinting
  Key idea
    Instead of working with n-digit numbers, perform all arithmetic modulo a random prime number q, where q > n^2 fits within the word size
    All observations made on the previous slide still hold
      Except that p = t_i does not guarantee a match
      Typically, we expect matches to be infrequent, so we can use an O(n) exact-matching algorithm to confirm probable matches.

Carter-Wegman-Rabin-Karp Algorithm
  Difficulty with Rabin-Karp: Need to generate random primes (not easy).
  New idea: Make the radix random, as opposed to the modulus
    We still compute modulo a prime q, but it is not random.
  Alternative interpretation: We treat P as a polynomial
    p(x) = \sum_{i=0}^{n-1} P[n-i] x^i
  and evaluate this polynomial at a randomly chosen value of x
  What is the likelihood of false matches? Note that a false match occurs when p(x) = t_i(x), or when p(x) - t_i(x) = 0.
  Arithmetic modulo a prime defines a field, so an (n-1)th degree polynomial has at most n-1 roots,
    i.e., at most a fraction (n-1)/q of the q possible choices of x result in a false match.
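
The fingerprinting idea can be sketched with a fixed prime modulus and a randomly chosen radix x, in the spirit of the Carter-Wegman variant. The specific choices below are assumptions of this example: the modulus 2^61 - 1 (a Mersenne prime), the use of character codes as "digits", and the `seed` parameter for reproducibility. Every fingerprint hit is confirmed by a direct comparison, so false matches cost time but never produce wrong answers.

```python
import random

def rolling_hash_search(P, T, q=(1 << 61) - 1, seed=None):
    """1-based start positions of P in T, via polynomial rolling hashes."""
    n, m = len(P), len(T)
    if n == 0 or n > m:
        return []
    x = random.Random(seed).randrange(2, q)   # random radix, fixed prime q
    top = pow(x, n - 1, q)                    # x^(n-1) mod q, fixed per window size

    def fingerprint(s):
        v = 0
        for ch in s:
            v = (v * x + ord(ch)) % q         # Horner evaluation of the polynomial
        return v

    p, t = fingerprint(P), fingerprint(T[:n])
    hits = []
    for i in range(m - n + 1):
        if p == t and T[i:i + n] == P:        # confirm probable matches exactly
            hits.append(i + 1)
        if i + n < m:
            # shift T[i] out of the window and shift T[i+n] in
            t = ((t - top * ord(T[i])) * x + ord(T[i + n])) % q
    return hits
```

The window update uses two multiplications and one modulo per text symbol, which matches the cost quoted for rolling hashes.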

Rolling Hashes
  RK and CWRK are examples of rolling hashes
    A hash computed on the text within a sliding window
    Key point: Incremental computation of the hash as the window slides.
  Polynomial-based hashes are easy to compute incrementally:
    t_(i+1) = (t_i - x^(n-1) T[i]) · x + T[i+n]
  Complexity:
    x^(n-1) is fixed once the window size is chosen
    Takes just two multiplications and one modulo per symbol
    O(m+n) multiplication/modulo operations in total

Other Rolling Hashes
  In some contexts, multiplication/modulo may be too expensive.
  Alternatives:
    Use shifts, cyclic shifts, substitution maps and xor operations, avoiding multiplications altogether
    Considerable research is needed to find good fingerprinting functions.
  Example: Adler32, used in zlib (used everywhere) and rsync.
    A_l = 1 + \sum_{k=0}^{l-1} t_{i+k}  mod 65521
    B = \sum_{k=1}^{n} A_k = n + \sum_{k=0}^{n-1} (n-k) t_{i+k}  mod 65521
    H = (B << 16) + A,  where A = A_n

Rolling Hash and Common Substring Problem
  To find a common substring of length l or more:
    Compute rolling hashes of P and T with window size l
    Takes O(n+m) time.
  O(nm) comparisons, so the expected number of collisions increases.
    Unless the collision probability is O(1/nm), the expected runtime can be nonlinear
  Can find the longest common substring (LCS) using a binary-search-like process, with total complexity of O((n+m) log(n+m))

zlib/gzip, rsync, binary diff, etc.
  rsync: Synchronizes directories across a network
    Need to minimize data transferred
      A diff requires entire files to be copied to the client side first!
    Uses timestamps (or whole-file checksums) to detect unchanged files
    For modified files, uses Adler-32 to identify modified regions
      Find common substrings of a certain length, say, 128 bytes
    Relies on the stronger MD-5 hash to verify unmodified regions
  gzip: Uses a rolling hash (Adler-32) to identify text that is repeated from the previous 32KB window
    Repeating text can be replaced with a "pointer": (offset, length).
  Binary diff: Many programs such as xdelta and svn need to perform diffs on binaries; they too rely on rolling hashes.
    diff depends critically on line breaks, so it does poorly on binaries
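
The Adler-32 sums lend themselves to a rolling update, sketched below. The `adler32` function follows the standard definition with modulus 65521; the `roll` helper is a derivation made for this example from the two sums A and B (sliding the window changes A by the difference of the outgoing and incoming bytes, and B by A' - 1 - n·out), not a function taken from zlib or rsync. The helper `check_against_zlib` compares the rolled value with Python's `zlib.adler32` on every window.

```python
import zlib

MOD = 65521  # largest prime below 2**16, the Adler-32 modulus

def adler32(data: bytes) -> int:
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD     # A = 1 + sum of bytes
        b = (b + a) % MOD        # B = sum of the successive values of A
    return (b << 16) | a

def roll(a, b, n, out_byte, in_byte):
    """Slide a window of n bytes by one position in O(1)."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a - 1) % MOD
    return a, b

def check_against_zlib(data: bytes, n: int) -> bool:
    """True if rolled checksums agree with zlib.adler32 on every window."""
    h = adler32(data[:n])
    a, b = h & 0xFFFF, h >> 16
    ok = h == zlib.adler32(data[:n])
    for i in range(len(data) - n):
        a, b = roll(a, b, n, data[i], data[i + n])
        ok = ok and ((b << 16) | a) == zlib.adler32(data[i + 1:i + 1 + n])
    return ok
```

No multiplication by a radix is needed, which is the appeal of this family of hashes when the window slides one byte at a time.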

Suffix Trees [Weiner 1973]
  A versatile data structure with wide applications in string search and computational biology.
  Key property behind suffix trees: substrings are prefixes of suffixes.
  Key point: Can be constructed in linear time!
  Supports sublinear exact match queries, and linear LCS queries.
    With linear-time preprocessing on the text (to build the suffix tree), yields better runtime than the techniques discussed so far.
    Typically, it is the text we preprocess, not the pattern.
  Applicable to single as well as multiple patterns or texts!

Suffix Tree Example
  Compressed trie of all suffixes of a string appended with $.
    Linear chains in the trie are compressed.
    Edges can now be substrings.
    Each internal state has at least two children.
  Failure links are used only during construction.
  Uses end-marker $.
  Leaves identify the starting position of that suffix.
  (Figure: suffix tree example; images from Wikipedia commons)

Finding Substrings and Suffixes
  Is p a substring of t?
    Example: Is nan a substring of banana?
  Solution:
    Follow the path labeled p from the root of the suffix tree for t.
    If you fail along the way, then no, else yes.
    p is a suffix if you reach a leaf at the end of p.
  O(|p|) time, independent of |t|: great for large t.

Counting # of Occurrences of p
  How many times does p occur in t?
  Solution:
    Follow the path labeled p from the root of the suffix tree for t.
    Count the number of leaves below.
  O(|p|) time if additional information (# of leaves below) is maintained at internal nodes.
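The substring, suffix, and counting queries can be illustrated with a small sketch. Note the swap: this is an uncompressed suffix trie built naively in O(|t|^2) time and space, not a linear-time compressed suffix tree, so it only demonstrates how the queries work. The dict-based node layout, the function names, and the example text banana are assumptions made for illustration; patterns are assumed not to contain `$`.

```python
def build_suffix_trie(t):
    """Uncompressed trie of all suffixes of t + '$'.

    Each node stores 'count', the number of suffixes passing through it,
    which equals the number of leaves below it.
    """
    t = t + "$"
    root = {"children": {}, "count": 0}
    for start in range(len(t)):
        node = root
        node["count"] += 1
        for ch in t[start:]:
            node = node["children"].setdefault(ch, {"children": {}, "count": 0})
            node["count"] += 1
    return root


def walk(root, p):
    """Follow the path labeled p from the root; None if we fall off."""
    node = root
    for ch in p:
        node = node["children"].get(ch)
        if node is None:
            return None
    return node


def is_substring(root, p):
    return walk(root, p) is not None


def is_suffix(root, p):
    # p is a suffix if the end-marker follows immediately after p.
    node = walk(root, p)
    return node is not None and "$" in node["children"]


def count_occurrences(root, p):
    # The stored leaf count makes this O(|p|) after preprocessing.
    node = walk(root, p)
    return node["count"] if node is not None else 0
```

With `root = build_suffix_trie("banana")`, the query `is_substring(root, "nan")` succeeds, `is_suffix(root, "ana")` succeeds, and `count_occurrences(root, "an")` gives 2. Each query touches at most |p| nodes, independent of the length of the text.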

Self-LCS (Or, Longest Common Repeat)
  What is the longest substring that repeats in t?
  Solution: Find the deepest non-leaf node with two or more children!
  In our example, it is ana.

Longest Common Extension
  LC extension of i and j: the longest common prefix of the suffixes starting at i and j.
  Locate the leaves labeled i and j.
  Find their least common ancestor (LCA).
  The string spelled out by the path from the root to this LCA is what we want.

LCS with another string p
  We can use the same procedure as LCR, if suffixes of p were also included in the suffix tree.
  Leads to the notion of a generalized suffix tree.

Generalized Suffix Trees
  Suffix trees for multiple strings p1,..., pn.
  (Figure omitted; images from UMD CMSC 423 Slides)
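The longest-repeat and longest-common-substring queries can be sketched on a generalized suffix structure. As a simplification, this uses an uncompressed trie of all suffixes of all input strings (quadratic size, no end-markers) rather than a true generalized suffix tree. Each node records how many suffixes pass through it and which input strings they came from; these fields and all function names are hypothetical. A node reached by two or more suffixes plays the role of "a non-leaf node with two or more leaves below".

```python
def new_node():
    return {"children": {}, "owners": set(), "count": 0}


def build_generalized_suffix_trie(strings):
    """Insert every suffix of every string; tag nodes with their owners."""
    root = new_node()
    for owner, s in enumerate(strings):
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].setdefault(ch, new_node())
                node["owners"].add(owner)   # which strings contain this path
                node["count"] += 1          # number of occurrences overall
    return root


def deepest(root, accept):
    """Label of the deepest node satisfying accept.

    Assumes accept is monotone: if a node fails, all its descendants fail,
    so failing subtrees are pruned.
    """
    best = ""
    stack = [(root, "")]
    while stack:
        node, label = stack.pop()
        for ch, child in node["children"].items():
            if accept(child):
                child_label = label + ch
                if len(child_label) > len(best):
                    best = child_label
                stack.append((child, child_label))
    return best


def longest_repeat(t):
    """Longest substring occurring at least twice in t."""
    root = build_generalized_suffix_trie([t])
    return deepest(root, lambda n: n["count"] >= 2)


def longest_common_substring(strings):
    """Longest substring occurring in every one of the given strings."""
    root = build_generalized_suffix_trie(strings)
    k = len(strings)
    return deepest(root, lambda n: len(n["owners"]) == k)
```

For instance, `longest_repeat("banana")` returns `"ana"` and `longest_common_substring(["xabcdy", "zzabcdq"])` returns `"abcd"`. When several answers of the same maximal length exist, which one is returned depends on traversal order.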

Generalized Suffix Tree: Applications
  LCS of p and t: Build a GST for p and t, and find the deepest node that has descendants corresponding to both p and t.
  LCS of p1,..., pk: Build a GST for p1 to pk, and find the deepest node that has descendants from all of p1,..., pk.
  Find strings in a database containing q:
    Build a suffix tree of all strings in the database.
    Follow the path that spells q.
    q occurs in every pi that appears below this node.

Suffix Arrays [Manber and Myers 1989]
  Drawbacks of suffix trees:
    Multiple pointers per internal node: significant storage costs.
    Pointer-chasing is not cache-friendly.
  Suffix arrays address these drawbacks.
    Requires the same asymptotic storage (O(n)), but constant factors are a lot smaller: 4x or so.
    Instead of navigating down a path in the tree, relies on binary search.
    Increases asymptotic cost by O(log n), but can be faster in practice due to better cache performance etc.

Suffix Arrays
  Construct a sorted array of suffixes, rather than tries.
  Can use 2 to 4 bytes per symbol.
  Use binary search to locate suffixes etc.

   i  T_i            A_i  T_{A_i}
   1  mississippi$    12  $
   2  ississippi$     11  i$
   3  ssissippi$       8  ippi$
   4  sissippi$        5  issippi$
   5  issippi$         2  ississippi$
   6  ssippi$          1  mississippi$
   7  sippi$          10  pi$
   8  ippi$            9  ppi$
   9  ppi$             7  sippi$
  10  pi$              4  sissippi$
  11  i$               6  ssippi$
  12  $                3  ssissippi$

Finding Suffix Arrays
  Maintaining the LCP of successive suffixes speeds up algorithms.
    Search for substring p in O(|p| + log |t|) time.
    Count the number of occurrences of p in O(|p| + log |t|) time.
    Search for the longest common repeat in O(|t|) time.
  Use binary search to locate suffixes etc.
   i  T_i            A_i  T_{A_i}        LCP
   1  mississippi$    12  $
   2  ississippi$     11  i$               0
   3  ssissippi$       8  ippi$            1
   4  sissippi$        5  issippi$         1
   5  issippi$         2  ississippi$      4
   6  ssippi$          1  mississippi$     0
   7  sippi$          10  pi$              0
   8  ippi$            9  ppi$             1
   9  ppi$             7  sippi$           0
  10  pi$              4  sissippi$        2
  11  i$               6  ssippi$          1
  12  $                3  ssissippi$       3
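The mississippi$ table can be reproduced with a short sketch. It makes several assumptions that are not in the slides. The suffix array is built by naively sorting the suffixes, which is far from linear time. The LCP column is computed with the algorithm of Kasai et al. Pattern search uses plain binary search costing O(|p| log |t|), without the LCP-based speedup to O(|p| + log |t|). Positions are 0-based in the code, whereas the table is 1-based, and all function names are hypothetical.

```python
def suffix_array(t):
    """Start positions (0-based) of the suffixes of t in sorted order."""
    return sorted(range(len(t)), key=lambda i: t[i:])


def lcp_array(t, sa):
    """lcp[r] = length of the common prefix of the suffixes at ranks r-1
    and r (lcp[0] = 0), computed in O(n) by Kasai et al.'s algorithm."""
    n = len(t)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]          # suffix just before i in sorted order
            while i + h < n and j + h < n and t[i + h] == t[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1                   # next suffix shares at least h-1 chars
        else:
            h = 0
    return lcp


def occurrences(t, sa, p):
    """Sorted start positions of p in t, found by two binary searches."""
    m = len(p)
    lo, hi = 0, len(sa)
    while lo < hi:                       # first rank whose m-prefix is >= p
        mid = (lo + hi) // 2
        if t[sa[mid]:sa[mid] + m] < p:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:                       # first rank whose m-prefix is > p
        mid = (lo + hi) // 2
        if t[sa[mid]:sa[mid] + m] <= p:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


def longest_repeat(t, sa, lcp):
    """The longest repeated substring is the largest LCP entry."""
    r = max(range(len(sa)), key=lambda k: lcp[k])
    return t[sa[r]:sa[r] + lcp[r]]
```

For `t = "mississippi$"`, adding 1 to each entry of `suffix_array(t)` gives the A_i column, `lcp_array` gives the LCP column (with 0 in the first row), `occurrences(t, sa, "ssi")` returns `[2, 5]`, and `longest_repeat` returns `"issi"`, matching the LCP value 4 in the table.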


More information

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2016 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 9 1. (4pts) ((p q) (q r)) (p r), prove tutology using truth tles. p

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

Formal Language and Automata Theory (CS21004)

Formal Language and Automata Theory (CS21004) Forml Lnguge nd Automt Forml Lnguge nd Automt Theory (CS21004) Khrgpur Khrgpur Khrgpur Forml Lnguge nd Automt Tle of Contents Forml Lnguge nd Automt Khrgpur 1 2 3 Khrgpur Forml Lnguge nd Automt Forml Lnguge

More information

Name Ima Sample ASU ID

Name Ima Sample ASU ID Nme Im Smple ASU ID 2468024680 CSE 355 Test 1, Fll 2016 30 Septemer 2016, 8:35-9:25.m., LSA 191 Regrding of Midterms If you elieve tht your grde hs not een dded up correctly, return the entire pper to

More information

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers Speech Recognition Lecture 2: Finite Automt nd Finite-Stte Trnsducers Eugene Weinstein Google, NYU Cournt Institute eugenew@cs.nyu.edu Slide Credit: Mehryr Mohri Preliminries Finite lphet, empty string.

More information

19 Optimal behavior: Game theory

19 Optimal behavior: Game theory Intro. to Artificil Intelligence: Dle Schuurmns, Relu Ptrscu 1 19 Optiml behvior: Gme theory Adversril stte dynmics hve to ccount for worst cse Compute policy π : S A tht mximizes minimum rewrd Let S (,

More information

More on automata. Michael George. March 24 April 7, 2014

More on automata. Michael George. March 24 April 7, 2014 More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose

More information

NFAs continued, Closure Properties of Regular Languages

NFAs continued, Closure Properties of Regular Languages lgorithms & Models of omputtion S/EE 374, Spring 209 NFs continued, losure Properties of Regulr Lnguges Lecture 5 Tuesdy, Jnury 29, 209 Regulr Lnguges, DFs, NFs Lnguges ccepted y DFs, NFs, nd regulr expressions

More information

Lecture 3: Equivalence Relations

Lecture 3: Equivalence Relations Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

Context-Free Grammars and Languages

Context-Free Grammars and Languages Context-Free Grmmrs nd Lnguges (Bsed on Hopcroft, Motwni nd Ullmn (2007) & Cohen (1997)) Introduction Consider n exmple sentence: A smll ct ets the fish English grmmr hs rules for constructing sentences;

More information

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3 2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is

More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

Lecture 09: Myhill-Nerode Theorem

Lecture 09: Myhill-Nerode Theorem CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives

More information

Alignment of Long Sequences. BMI/CS Spring 2016 Anthony Gitter

Alignment of Long Sequences. BMI/CS Spring 2016 Anthony Gitter Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostt.wisc.edu Gols for Lecture Key concepts how lrge-scle lignment differs from the simple cse the

More information

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary Outline Genetic Progrmming Evolutionry strtegies Genetic progrmming Summry Bsed on the mteril provided y Professor Michel Negnevitsky Evolutionry Strtegies An pproch simulting nturl evolution ws proposed

More information

Thoery of Automata CS402

Thoery of Automata CS402 Thoery of Automt C402 Theory of Automt Tle of contents: Lecture N0. 1... 4 ummry... 4 Wht does utomt men?... 4 Introduction to lnguges... 4 Alphets... 4 trings... 4 Defining Lnguges... 5 Lecture N0. 2...

More information