What's Behind BLAST. Gene Myers, Director, MPI for Cell Biology and Genetics, Dresden, DE
Transcription
1 What's Behind BLAST Gene Myers, Director, MPI for Cell Biology and Genetics, Dresden, DE
2 Approximate String Search Given a string A of length n, a query Q of length p ≤ n, an alignment scoring function δ, and a threshold d: find all substrings of A, say M, s.t. δ(Q,M) ≤ d. δ here = simple Levenshtein (unit cost mismatch, insert, & delete). [Figure: example alignment of the query against a substring of A] The example is a 3-match (absolute), i.e. a 25%-match (relative).
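The problem statement above can be made concrete with the classical dynamic-programming search, in which a match is allowed to start at any position of A. The following is a minimal sketch, assuming unit-cost Levenshtein scoring; the function name `approximate_matches` and its return convention are illustrative choices, not something taken from the slides.

```python
def approximate_matches(a: str, q: str, d: int) -> list[tuple[int, int]]:
    """Return (end, dist) pairs: some substring of `a` ending just before
    index `end` is within `dist` <= d unit-cost edits of the query `q`."""
    p = len(q)
    # col[i] = best distance between q[:i] and a substring of `a` ending
    # at the current position; col[0] stays 0 because a match may start anywhere.
    col = list(range(p + 1))
    hits = []
    if col[p] <= d:
        hits.append((0, col[p]))
    for end, ch in enumerate(a, 1):
        prev_diag = col[0]
        for i in range(1, p + 1):
            cur = min(prev_diag + (q[i - 1] != ch),  # match / mismatch
                      col[i] + 1,                    # extra symbol in a
                      col[i - 1] + 1)                # symbol of q left out
            prev_diag, col[i] = col[i], cur
        if col[p] <= d:
            hits.append((end, col[p]))
    return hits
```

This takes O(pn) time, which is the baseline that the index-based methods in the rest of the talk try to beat.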
3 Edit Graphs Dynamic Programming Matrix
4 Alignments An alignment corresponds to a path in the edit graph of the two sequences. [Figure: edit graph of A (length N) versus B (length M), with an alignment drawn as a path]
5 The Story March 88: The Lister Hill Meeting & Galil's 2 questions
6 The Beginning Workshop for Algorithms in Molecular Genetics, March 26-28, 1988. S. Altschul, W. Fitch, Z. Galil, W. Goad, T. Hunkapillar, S. Karlin, G. Landau, E. Lander, D. Lipman, J. Maisel, H. Martinez, C. Sanders, T. Smith, R. Staden, J. Turner, M. Zuker, A. Mukherjee, M. Waterman, D. Sankoff, P. Sellers, E. Ukkonen, W. Miller, G. Myers
7 Galil's 2 Questions Workshop for Algorithms in Molecular Genetics, March 26-28, 1988. Zvi gave a talk about suffix trees. Q1: Can one get rid of the annoying dependence on alphabet size Σ? (Manber & Myers, Suffix Arrays, 1990) Q2: Can one use an index to get faster approximate search?
8 Suffix Arrays Given a subject string of size n over an alphabet of size Σ, build an index that determines efficiently if a query string of length p occurs in the subject. Suffix tree: index is O(nΣ) space, then O(p) time; or index is O(n) space, then O(p log Σ) time. Galil: remove the annoying dependence on Σ. Manber & Myers, suffix array: index is O(n) space, O(p + log n / log Σ) time. Galil says we misunderstood his challenge, sigh. But suffix trees enabled Burrows-Wheeler Transforms, which are a sparse index commonly in use today for NGS.
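To illustrate what a suffix array is and how it answers "does the query occur?", here is a minimal sketch. It builds the array by plainly sorting suffixes and searches with two binary searches, so it is simpler and slower than the Manber & Myers construction and search named on the slide; the function names are illustrative.

```python
def build_suffix_array(s: str) -> list[int]:
    """Start positions of all suffixes of `s`, in lexicographic order.
    Naive construction by sorting; fine for small examples only."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_occurrences(s: str, sa: list[int], q: str) -> list[int]:
    """All start positions of `q` in `s`, using O(log n) string comparisons."""
    p = len(q)
    # First suffix whose length-p prefix is >= q.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + p] < q:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # First suffix whose length-p prefix is > q.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + p] <= q:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])
```

All occurrences of the query sit in one contiguous block of the array, which is why two boundary searches suffice.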
9 A Simple Index Φ("cacgt") = 283 (base 10) ∈ [0, Σ^k - 1] for any fixed k-mer size k. [Figure: arrays Idx (indexed 0..Σ^k) and Pos (indexed 0..n-1)] Scan 1: Count how big each set Occ(c) will be (in Idx[c+1]), then set Idx[c] += Idx[c-1] to point to the proper Pos index. Scan 2: Fill in each set using Idx[c] as a finger to place the next position, then readjust the indices (Idx[c] = Idx[c-1]). Occurrences of k-mers with code c: Occ(c) = { p : Φ(A[p..p+k-1]) = c } = { Pos[j] : j ∈ [Idx[c], Idx[c+1]-1] }. Performance: O(n + Σ^k) time and exactly n + Σ^k integers. If we choose k ~ log_Σ n then this is O(n). O(p + h) expected time to find any string of length p with h hits.
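The two scans described above can be sketched as follows, assuming a DNA alphabet coded a=0, c=1, g=2, t=3 (an assumption; the slide only shows the arrays Idx and Pos). The helper names `phi`, `build_index` and `occurrences` are illustrative, and the k-mer codes are recomputed from scratch rather than rolled, which costs O(nk) instead of O(n).

```python
CODE = {"a": 0, "c": 1, "g": 2, "t": 3}  # assumed symbol coding
SIGMA = len(CODE)


def phi(kmer: str) -> int:
    """Read the k-mer as a base-SIGMA number."""
    code = 0
    for ch in kmer:
        code = code * SIGMA + CODE[ch]
    return code


def build_index(a: str, k: int) -> tuple[list[int], list[int]]:
    codes = [phi(a[p:p + k]) for p in range(max(0, len(a) - k + 1))]
    idx = [0] * (SIGMA ** k + 1)
    # Scan 1: count each bucket in idx[c+1], then prefix-sum so that
    # idx[c] is the first slot of bucket c in pos.
    for c in codes:
        idx[c + 1] += 1
    for c in range(1, len(idx)):
        idx[c] += idx[c - 1]
    # Scan 2: use idx[c] as a moving finger while filling bucket c.
    pos = [0] * len(codes)
    for p, c in enumerate(codes):
        pos[idx[c]] = p
        idx[c] += 1
    # Each finger now sits at the start of the next bucket; shift back.
    for c in range(len(idx) - 1, 0, -1):
        idx[c] = idx[c - 1]
    idx[0] = 0
    return idx, pos


def occurrences(idx: list[int], pos: list[int], kmer: str) -> list[int]:
    c = phi(kmer)
    return pos[idx[c]:idx[c + 1]]
```

Because positions are inserted in increasing order, each bucket of `pos` comes out sorted by position.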
11 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend
12 APM Filters A filter is an algorithm that eliminates a lot of that which isn't desired. [Figure: exact algorithms, filters, and heuristics compared by sensitivity (Sn) and specificity (Sp)] If a filter is fast & specific then it can improve the speed of an exact algorithm. Approximate match filter ideas: (1) Look for exact matches to k-mers of the query, in an index (Pearson & Lipman, FASTA; Chang & Lawler, O(dn/lg p)). (2) Instead look for k-mers that are a small distance away, e.g. 1 or 2 differences, from a k-mer of the query, i.e. the neighborhood N_d(w) = { v : v and w are ≤ d differences apart }, with N_d(k) ≤ C(k,d)·(2Σ)^d, where C(k,d) is the binomial coefficient.
13 The Power of Neighborhoods Consider looking for a 9%-match of 40 symbols (≤ 3 differences, or a 3-match). If we divide the query into 4 10-mers then at least one must match exactly: get a hit every Σ^10 / 4 symbols. If we divide it into 2 20-mers then at least one of the N_1 strings must match exactly: get a hit every Σ^20 / (2·N_1(20)) symbols. 10,000 times more specific! (but 80x more lookups)
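The arithmetic behind the comparison can be replayed in a few lines. This is a sketch under two assumptions that are not stated on this slide: Σ = 4 for DNA, and the size of N_1(20) is taken from the estimate C(k,d)·(2Σ)^d quoted on the filter slide.

```python
from math import comb

SIGMA = 4  # assumed DNA alphabet size


def neighborhood_estimate(k: int, d: int, sigma: int = SIGMA) -> int:
    """The estimate C(k, d) * (2 * sigma)^d for the size of a d-neighborhood."""
    return comb(k, d) * (2 * sigma) ** d


# Four exact 10-mer seeds: expected spacing between random hits.
exact_spacing = SIGMA ** 10 / 4

# Two 20-mer seeds, each expanded to its 1-neighborhood.
n1 = neighborhood_estimate(20, 1)
neighborhood_spacing = SIGMA ** 20 / (2 * n1)

specificity_gain = neighborhood_spacing / exact_spacing
lookup_ratio = (2 * n1) / 4

print(exact_spacing, n1, neighborhood_spacing, specificity_gain, lookup_ratio)
```

Under these assumptions the gain comes out on the order of 10^4 and the lookup ratio at 80, in line with the figures quoted on the slide.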
14 Seed & Extend The seed matches (either exact or from a neighborhood) in effect define areas within the edit graph of Q vs A where the alignment of an ε-match could be. [Figure: edit graph of Q = s1 s2 s3 s4 versus A, with seed hits marked]
15 Seed & Extend [Figure: the same edit graph, with a band of ±d/4 drawn around a seed hit]
16 Seed & Extend [Figure: the same edit graph, with a band of ±d drawn around a seed hit] Spend O(pdh + pz) time, where h(k) = the number of seed k-hits and z(k) = the neighborhood size of the k-words. Both z and h are functions of k, and the optimal k is slightly bigger than log_Σ n.
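A minimal sketch of the exact-seed variant of seed & extend: cut Q into d+1 pieces, so that any match with at most d differences must contain one piece exactly, then verify only a window of A around each seed hit with dynamic programming. The use of `str.find` as a stand-in for the k-mer index, the window arithmetic, and the function names are assumptions made for illustration.

```python
def best_substring_distance(text: str, q: str) -> int:
    """Minimum unit-cost edit distance between q and any substring of text."""
    col = list(range(len(q) + 1))
    best = col[-1]
    for ch in text:
        prev_diag = col[0]  # col[0] stays 0: a match may start anywhere
        for i in range(1, len(q) + 1):
            cur = min(prev_diag + (q[i - 1] != ch), col[i] + 1, col[i - 1] + 1)
            prev_diag, col[i] = col[i], cur
        best = min(best, col[-1])
    return best


def seed_and_extend(a: str, q: str, d: int) -> list[tuple[int, int]]:
    """Windows (start, end) of `a` that contain a d-match of `q`."""
    p = len(q)
    if p <= d:
        raise ValueError("query must be longer than the threshold d")
    pieces = d + 1
    bounds = [p * i // pieces for i in range(pieces + 1)]
    windows = set()
    for i in range(pieces):
        lo, hi = bounds[i], bounds[i + 1]
        seed = q[lo:hi]
        hit = a.find(seed)  # stand-in for an index lookup
        while hit != -1:
            # A match through this seed starts no earlier than hit-lo-d
            # and ends no later than hit+(p-lo)+d.
            windows.add((max(0, hit - lo - d), min(len(a), hit + (p - lo) + d)))
            hit = a.find(seed, hit + 1)
    return sorted(w for w in windows
                  if best_substring_distance(a[w[0]:w[1]], q) <= d)
```

The verification step costs O(p·(p + 2d)) per seed hit, so the total work is governed by the number of hits h, which is the trade-off between h(k) and z(k) that the slide describes.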
17 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break
18 The 1st Conversation
19 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break; Fall 89: Blast is Born
20 Blast = Seed & Extend Seeds are neighborhoods of all k-mers of the query under weighted Levenshtein (e.g. PAM120). Find seeds with a deterministic finite automaton accepting all neighborhood words (O(n)). Extend is just weighted Hamming, but stop when the score drops too much. A heuristic. blast was inspired by slam = sublinear approximate match.
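The "extend until the score drops too much" rule can be sketched as an ungapped extension with a drop-off threshold. The scores (+1 match, -1 mismatch) and the threshold `x_drop` are hypothetical placeholders; the slide refers to weighted scoring such as PAM120, which is not reproduced here.

```python
def extend_right(a: str, q: str, i: int, j: int,
                 match: int = 1, mismatch: int = -1,
                 x_drop: int = 3) -> tuple[int, int]:
    """Ungapped extension to the right from a[i] / q[j].

    Returns (best_score, length_of_best_extension). The scan stops once
    the running score falls more than `x_drop` below the best seen."""
    score = best = 0
    best_len = 0
    n = 0
    while i + n < len(a) and j + n < len(q):
        score += match if a[i + n] == q[j + n] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score > x_drop:
            break
    return best, best_len
```

A full extension would run the same scan leftwards from the seed as well and add the two best scores; that part is omitted to keep the sketch short.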
22 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break; Fall 89: Blast is Born; Fall 89: The Splitting Lemma
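23 The Splitting Lemma Lemma: If w ε-matches v then either (a) w0 has an ε-match to a prefix (call it v0) of v, or (b) w1 has an ε-match to a suffix (call it v1) of v. Proof: Let w match v with k = ε|w| errors. Does w0 match a prefix of v with ≤ k/2 errors? Does w1 match a suffix of v with ≤ k/2 errors? By the Pigeonhole Principle, at least one of them does.
24 [Figure: recursive splitting: w1 matches v1 with ≤ k/2 errors, w10 matches v10 with ≤ k/4 errors, and w101 matches v101 with ≤ k/8 errors]
25 The Splitting Lemma Let $w_\epsilon = w$ (the subscript being the empty string) and, for a binary string β, let $w_{\beta a} = w_\beta[1..|w_\beta|/2]$ if a = 0 and $w_{\beta a} = w_\beta[|w_\beta|/2+1..|w_\beta|]$ if a = 1. [Figure: the splitting tree w, w0, w1, w10, w11, w100, w101, with w101 matching v101 with ≤ k/8 errors] Lemma: If w ε-matches v then ∃ α s.t. ∀ prefixes β of α, (1) $w_\beta$ has an ε-match to a substring (call it $v_\beta$) of v, and (2) $v_{\beta 0}$ is a prefix of $v_\beta$ (if β0 is a prefix of α), and (3) $v_{\beta 1}$ is a suffix of $v_\beta$ (if β1 is a prefix of α).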
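The one-level version of the lemma can be checked empirically. The sketch below assumes unit-cost edit distance and takes ε = k/|w| where k is the distance between w and v; the helper names are illustrative. It tests whether the first half of w matches some prefix of v, or the second half matches some suffix of v, within the same error rate.

```python
import random


def last_row(w: str, v: str) -> list[int]:
    """row[j] = unit-cost edit distance between w and the prefix v[:j]."""
    row = list(range(len(v) + 1))
    for i, wc in enumerate(w, 1):
        prev_diag, row[0] = row[0], i
        for j, vc in enumerate(v, 1):
            cur = min(prev_diag + (wc != vc), row[j] + 1, row[j - 1] + 1)
            prev_diag, row[j] = row[j], cur
    return row


def splitting_holds(w: str, v: str) -> bool:
    k = last_row(w, v)[-1]                    # w matches v with k errors
    half = len(w) // 2
    w0, w1 = w[:half], w[half:]
    k0 = min(last_row(w0, v))                 # best prefix of v for w0
    k1 = min(last_row(w1[::-1], v[::-1]))     # best suffix of v for w1
    # Compare error rates k0/|w0| and k1/|w1| with k/|w| in integers.
    return k0 * len(w) <= k * len(w0) or k1 * len(w) <= k * len(w1)


random.seed(0)
pairs = [("".join(random.choices("acgt", k=random.randint(1, 12))),
          "".join(random.choices("acgt", k=random.randint(0, 12))))
         for _ in range(200)]
print(all(splitting_holds(w, v) for w, v in pairs))
```

The check mirrors the pigeonhole argument: an optimal alignment cuts v in two where w is cut, and the two halves cannot both exceed the overall error rate.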
27 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break; Fall 89: Blast is Born; Fall 89: The Splitting Lemma; Fall 89: Seed & Extend by Doubling
28 Doubling Extension Use log_Σ n as the seed size! Lemma: Any ε-match of Q has an ε-match to at least one seed segment of size log_Σ n. Use the splitting lemma to split Q into seeds of size log_Σ n and, instead of extending all at once, extend by doubling using the splitting lemma. The time for each extension telescopes hyper-geometrically and so is dominated by the first term: O((p / log_Σ n) · h · log_Σ n · ε log_Σ n) = O(d h log_Σ n).
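The last equality is plain algebra once the relative threshold ε and the absolute threshold d are related by d = εp, which is an assumption read off the earlier "3-match (absolute) / 25%-match (relative)" example rather than something stated on this slide:

```latex
\frac{p}{\log_\Sigma n}\cdot h\cdot \log_\Sigma n\cdot \varepsilon\log_\Sigma n
  \;=\; (\varepsilon p)\, h \log_\Sigma n
  \;=\; d\, h \log_\Sigma n .
```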
30 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break; Fall 89: Blast is Born; Fall 89: The Splitting Lemma; Fall 89: Seed & Extend by Doubling; Spr 90: Generating Condensed Neighborhoods
31 Generating (Condensed) Neighborhoods The condensed neighborhood is { v : v and w are ≤ d differences apart and v is not a proper prefix of another word in N_d(w) }. [Example: the words of a condensed 1-neighborhood] It suffices to find the words in the condensed neighborhood. But how do you do that efficiently, including finding them in the index? Compute rows of the dynamic programming matrix as one traverses the trie of all strings over Σ.
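The trie traversal can be sketched as a depth-first search that carries one dynamic-programming row per node and abandons a branch as soon as every entry of the row exceeds d. This minimal version, assuming unit-cost edits, enumerates the full neighborhood N_d(w) rather than the condensed one, and it omits the index lookup; `brute_force_neighborhood` is included only as a cross-check. All names are illustrative.

```python
from itertools import product


def next_row(row: list[int], w: str, ch: str) -> list[int]:
    """D.P. row for v+ch given the row for v; row[i] = dist(v, w[:i])."""
    new = [row[0] + 1]
    for i in range(1, len(w) + 1):
        new.append(min(row[i - 1] + (w[i - 1] != ch), row[i] + 1, new[i - 1] + 1))
    return new


def neighborhood(w: str, d: int, alphabet: str) -> list[str]:
    """All strings within d unit-cost edits of w, found by trie DFS."""
    found = []

    def visit(v: str, row: list[int]) -> None:
        if min(row) > d:
            return  # no extension of v can come back within d
        if row[-1] <= d:
            found.append(v)
        for ch in alphabet:
            visit(v + ch, next_row(row, w, ch))

    visit("", list(range(len(w) + 1)))
    return found


def brute_force_neighborhood(w: str, d: int, alphabet: str) -> list[str]:
    """Reference answer: test every string of length <= |w| + d."""
    words = []
    for length in range(len(w) + d + 1):
        for letters in product(alphabet, repeat=length):
            v = "".join(letters)
            row = list(range(len(w) + 1))
            for ch in v:
                row = next_row(row, w, ch)
            if row[-1] <= d:
                words.append(v)
    return words
```

The pruning is safe because the minimum of a row never decreases when a letter is appended, and the search terminates because that minimum is at least |v| - |w|.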
32 Condensed Neighborhoods [Figure: D.P. matrix of v versus w] Done: 1 in the right corner
33 Condensed Neighborhoods [Figure: D.P. matrix of v versus w] Only need ±1 and!
34 Condensed Neighborhoods [Figure: D.P. matrix of v versus w]
35 Condensed Neighborhoods [Figure: D.P. matrix] If all entries are > d then we are wasting time on D.P.
36 Condensed Neighborhoods [Figure: D.P. matrix] If all entries are > d then we are wasting time on D.P.
37 Condensed Neighborhoods [Figure: D.P. matrix]
38 Condensed Neighborhoods Use KMP on the reverse of w to efficiently discover these. A shorter suffix of w that is a prefix of the extension is also possible.
39 Condensed Neighborhoods Lemma: Neighborhoods and their hits in A can be generated in O(zd + h) time, where z = |N_d(w)|.
41 The Story March 88: The Lister Hill Meeting & Galil's 2 questions; June 88: Seed & Extend; May 89: The TRW Chip & The Cigarette Break; Fall 89: Blast is Born; Fall 89: The Splitting Lemma; Fall 89: Seed & Extend by Doubling; Spr 90: Generating Condensed Neighborhoods; Fall 90: Finale: Complexity
42 Complexity How big is N_d(k)? Developed a recurrence for non-redundant edit scripts: (a) DI = S, (b) DS = SD, (c) IS = SI, (d) ID = Φ. Lemma: $S(k,d) = S(k-1,d) + (\Sigma-1)\,S(k-1,d-1) + (\Sigma-1)\sum_{j=0}^{d-1}\Sigma^{j}\,S(k-1,d-1) + (\Sigma-1)^{2}\sum_{j=0}^{d-2}\Sigma^{j}\,S(k-2,d-2-j) + \sum_{j=0}^{d-1} S(k-2-j,d-1-j)$ and $N_d(k) \le S(k,d) + \sum_{j=1}^{d}\Sigma^{j}\,S(k-1,d-j)$.
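43 Complexity So how big is it? Lemma: $N_\varepsilon(k) \le \alpha(\varepsilon)^{k}$ where $\alpha(\varepsilon) = \Sigma^{\mathrm{pow}(\varepsilon)}$, $\mathrm{pow}(\varepsilon) = \log_\Sigma\!\left(\frac{c(\varepsilon)+1}{c(\varepsilon)-1}\right) + \varepsilon\log_\Sigma c(\varepsilon) + \varepsilon$, and $c(\varepsilon) = \varepsilon^{-1} + (1+\varepsilon^{-2})^{0.5}$. Also $\Pr(w \in N_\varepsilon(k)) = O(1/\beta(\varepsilon)^{k})$ where $\beta(\varepsilon) = \Sigma^{1-\mathrm{pow}(\varepsilon)}$.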
44 [Figure: plot of pow(ε)]
45 Complexity So how big is it? Lemma: N_ε(k) = O(α^k) where α = Σ^pow(ε), the "flex factor", which starts at 1 (ε = 0) and grows. Pr(w in N_ε(k)) = O(1/β^k) where β = Σ^(1-pow(ε)), the "effective alphabet size", which starts at Σ (ε = 0) and shrinks. And when k = log_Σ n? N_ε(k) = O(n^pow(ε)) and Pr(w in N_ε(k)) = O(n^(pow(ε)-1)).
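To get a feel for these quantities, the closed forms can be evaluated numerically. This is a sketch assuming the formulas pow(ε) = log_Σ((c+1)/(c-1)) + ε·log_Σ c + ε and c = 1/ε + sqrt(1 + 1/ε²) as read from the complexity lemma; the function names are illustrative and no values are taken from the talk.

```python
import math


def c(eps: float) -> float:
    return 1 / eps + math.sqrt(1 + eps ** -2)


def pow_eps(eps: float, sigma: int) -> float:
    """Exponent pow(eps): neighborhood size grows like sigma**(pow * k)."""
    ce = c(eps)
    return math.log((ce + 1) / (ce - 1), sigma) + eps * math.log(ce, sigma) + eps


def alpha(eps: float, sigma: int) -> float:
    """Flex factor: tends to 1 as eps tends to 0, and grows with eps."""
    return sigma ** pow_eps(eps, sigma)


def beta(eps: float, sigma: int) -> float:
    """Effective alphabet size: tends to sigma as eps tends to 0, and shrinks."""
    return sigma ** (1 - pow_eps(eps, sigma))


for eps in (0.05, 0.1, 0.2):
    print(eps, round(pow_eps(eps, 4), 3), round(alpha(eps, 4), 3), round(beta(eps, 4), 3))
```

With k = log_Σ n the neighborhood bound becomes n^pow(ε), so the search stays sublinear in n exactly as long as pow(ε) is below 1.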
46 The Result Theorem: Given that (a) A is effectively Bernoulli, (b) there is a simple O(n) space, precomputed index of A, and (c) there are h d-matches of the query Q to A, then they can be found in O(d n^pow(ε) log n + pdh) expected time.
47 To my knowledge no one has improved on this in the last 20 years!?! Algorithmica 12, 4-5 (1994) (submitted 1991!). A recent retrospective: Computational Biology 19 (Springer-Verlag 2013), 3-15.
More informationTheory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38
Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control
More informationIs there an easy way to find examples of such triples? Why yes! Just look at an ordinary multiplication table to find them!
PUSHING PYTHAGORAS 009 Jmes Tnton A triple of integers ( bc,, ) is clled Pythgoren triple if exmple, some clssic triples re ( 3,4,5 ), ( 5,1,13 ), ( ) fond of ( 0,1,9 ) nd ( 119,10,169 ). + b = c. For
More informationAutomata and Languages
Automt nd Lnguges Prof. Mohmed Hmd Softwre Engineering Lb. The University of Aizu Jpn Grmmr Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Regulr Lnguges Context Free Lnguges Context Sensitive
More informationa,b a 1 a 2 a 3 a,b 1 a,b a,b 2 3 a,b a,b a 2 a,b CS Determinisitic Finite Automata 1
CS4 45- Determinisitic Finite Automt -: Genertors vs. Checkers Regulr expressions re one wy to specify forml lnguge String Genertor Genertes strings in the lnguge Deterministic Finite Automt (DFA) re nother
More informationMath 4310 Solutions to homework 1 Due 9/1/16
Mth 4310 Solutions to homework 1 Due 9/1/16 1. Use the Eucliden lgorithm to find the following gretest common divisors. () gcd(252, 180) = 36 (b) gcd(513, 187) = 1 (c) gcd(7684, 4148) = 68 252 = 180 1
More informationList all of the possible rational roots of each equation. Then find all solutions (both real and imaginary) of the equation. 1.
Mth Anlysis CP WS 4.X- Section 4.-4.4 Review Complete ech question without the use of grphing clcultor.. Compre the mening of the words: roots, zeros nd fctors.. Determine whether - is root of 0. Show
More informationLecture 3: Equivalence Relations
Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts
More informationThe Minimum Label Spanning Tree Problem: Illustrating the Utility of Genetic Algorithms
The Minimum Lel Spnning Tree Prolem: Illustrting the Utility of Genetic Algorithms Yupei Xiong, Univ. of Mrylnd Bruce Golden, Univ. of Mrylnd Edwrd Wsil, Americn Univ. Presented t BAE Systems Distinguished
More informationCS 188: Artificial Intelligence Spring 2007
CS 188: Artificil Intelligence Spring 2007 Lecture 3: Queue-Bsed Serch 1/23/2007 Srini Nrynn UC Berkeley Mny slides over the course dpted from Dn Klein, Sturt Russell or Andrew Moore Announcements Assignment
More informationCS 311 Homework 3 due 16:30, Thursday, 14 th October 2010
CS 311 Homework 3 due 16:30, Thursdy, 14 th Octoer 2010 Homework must e sumitted on pper, in clss. Question 1. [15 pts.; 5 pts. ech] Drw stte digrms for NFAs recognizing the following lnguges:. L = {w
More informationFinite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh
Finite Automt Informtics 2A: Lecture 3 Mry Cryn School of Informtics University of Edinburgh mcryn@inf.ed.c.uk 21 September 2018 1 / 30 Lnguges nd Automt Wht is lnguge? Finite utomt: recp Some forml definitions
More informationPrefix-Free Regular-Expression Matching
Prefix-Free Regulr-Expression Mthing Yo-Su Hn, Yjun Wng nd Derik Wood Deprtment of Computer Siene HKUST Prefix-Free Regulr-Expression Mthing p.1/15 Pttern Mthing Given pttern P nd text T, find ll sustrings
More informationNFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.
NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD
More informationState space systems analysis (continued) Stability. A. Definitions A system is said to be Asymptotically Stable (AS) when it satisfies
Stte spce systems nlysis (continued) Stbility A. Definitions A system is sid to be Asymptoticlly Stble (AS) when it stisfies ut () = 0, t > 0 lim xt () 0. t A system is AS if nd only if the impulse response
More information