The NFA Segments Scan Algorithm
Omer Barkol, David Lehavi
HP Laboratories HPL

Keyword(s): formal languages; regular expression; automata

Abstract: We present a novel way of parsing text with nondeterministic finite automata. For "real life" regular expressions and text, our algorithm scans only a fraction of the characters, and performs a small number of operations for each of these characters (for synthetic worst-case scenarios, it would perform worse than classical algorithms). Although there are similar approaches, our algorithm is far simpler and less resource-consuming than the alternatives we are aware of.

External Posting Date: February 21, 2014 [Fulltext]
Internal Posting Date: February 21, 2014 [Fulltext]
Approved for External Publication
Copyright 2014 Hewlett-Packard Development Company, L.P.
The NFA segments scan algorithm

Omer Barkol and David Lehavi
HP Labs Israel

1 Introduction

The pattern matching problem calls for discovering whether a given string x is in a language L. In the case where L is the language of strings containing a given word as a substring, there are several fast (and by now classical) algorithms; see [BM] and [KMP]. In the case where L is the language of strings containing one word out of a given set, there are two (again, rather classical) algorithms; see [AC], [C-W]. In the case where L is a general regular language, we are aware of two approaches: a rather complicated approach presented in [WW], and a rather simple but resource-consuming one presented in [Ke] (in which one has to maintain an entire suffix tree for each state in the automaton corresponding to the regular language). Our approach is somewhat similar to the one presented in [Ke], but it avoids the big memory overhead, and is easier to analyze and generalize.

Our method is motivated by our observations on the structure of real-life regular expressions on the one hand, and the moral of the Boyer-Moore algorithm on the other: many real-life regular expressions are composed of contiguous words connected by either "or" operations or Kleene-* operations on a single node in the automaton. Adapting the Boyer-Moore philosophy, we match the words between the Kleene-*s, and jump ahead in order to match the next word(s).
We summarize this approach in the following principles:

- At any stage of the execution, one should hold all the possible sub-matches of the processed sub-string to the automaton.
- Regarding a non-deterministic finite automaton as a directed graph, instead of storing a sub-match as a path on the automaton, store only the path's endpoints.
- Instead of advancing one character at a time over easily matched pieces of the string/automaton and matching a contiguous piece of the string to a path, one can jump and attempt to match another piece of the string to another part of the automaton; the resulting paths should then be glued.
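As a baseline for comparison, the classical one-character-at-a-time NFA simulation of Thompson (recalled as Algorithm 1 in the preliminaries below) can be sketched in a few lines. This is a minimal illustration, not the paper's code; the dictionary encoding of the transition function δ is our own assumption:

```python
# Minimal sketch of Thompson-style NFA simulation: scan the string left to
# right, maintaining the set Z of currently active states.
# The NFA is encoded (our assumption) as delta[state][char] -> set of states.

def thompson_match(x, delta, q0, qF):
    Z = {q0}
    for c in x:
        if qF in Z:
            return True  # match!
        # crawl right: advance every active state on the next character
        Z = {q2 for q in Z for q2 in delta.get(q, {}).get(c, set())}
        if not Z:
            return False  # no active states
    return qF in Z

# Example 1's automaton for .*a.*bca over the alphabet {a, b, c}
# ('.'-self-loops on q0 and q1 are spelled out per character):
DELTA = {
    'q0': {'a': {'q0', 'q1'}, 'b': {'q0'}, 'c': {'q0'}},
    'q1': {'a': {'q1'}, 'b': {'q1', 'q2'}, 'c': {'q1'}},
    'q2': {'c': {'q3'}},
    'q3': {'a': {'qF'}},
}
```

For instance, `thompson_match("abca", DELTA, "q0", "qF")` succeeds, since "abca" contains an a followed later by bca, while "bca" alone does not match.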
2 Preliminaries

Notation: For each string x we denote the i-th character of x by x_i. We denote by [i, j] the segment (or set) of integers {i, i+1, ..., j-1, j}. Given a nondeterministic automaton M = (Q, Σ, δ, q_0, {q_F}) we define a path on the automaton to be a map f : [i, j] → Q such that for each i ≤ k < j, f(k+1) ∈ δ(f(k)). We then say that f[i, j] is a path from q to q' if f(i) = q and f(j) = q'. We define the distance between two states q, q' ∈ Q as

  dist(q, q') := min { j - i : f[i, j] is a path from q to q' }.

We assume that M does not admit sinks (i.e. nodes with self-loops as the only outgoing edges). Before adding some less standard definitions to support our algorithm, we recall Thompson's algorithm as presented in [Th]:

Algorithm 1: Thompson's algorithm
  Input: x = (x_0, ..., x_{|x|-1}) ∈ Σ^{|x|}, M
  Output: True if M matches x; False otherwise
  1: p ← 0
  2: Z ← {q_0}
  3: while p < |x| do
  4:   if q_F ∈ Z then return True             // match!
  5:   Z ← {q' | ∃q ∈ Z : q' ∈ δ(q, x_p)}      // crawl right
  6:   p ← p + 1
  7:   if Z = ∅ then return False              // no active states
  8: end
  9: return False                              // reached end of string

We now add a few non-standard definitions:

  φ(q) = True  ⟺  ∀σ ∈ Σ : q ∈ δ(q, σ), or q = q_F;
  ψ(q) = True  ⟺  ∃x ∈ Σ*, x ≠ ε : q ∈ δ(q, x), or q = q_0, or q = q_F.

Example 1. If our automaton is the one corresponding to the regular expression .*a.*bca, i.e. (with '.'-self-loops on q_0 and q_1):

  start → q_0 −a→ q_1 −b→ q_2 −c→ q_3 −a→ q_F

then φ(q) = ψ(q) = True if and only if q ∈ {q_0, q_1, q_F}.

Intuitively, and keeping our motivation (presented in the introduction) in mind, one should think about nodes satisfying φ as nodes from which we always want to jump. There is no point in reading one character at a time if the only possible path we are trying to extend is one ending with a node satisfying φ(q); in a sense this is exactly the Boyer-Moore approach: if the match of the pattern fails and you have to start looking from the beginning of the pattern (which is the equivalent of a node satisfying φ(q)), you should jump as far as possible before starting.

The intuition behind nodes satisfying ψ(q) is more heuristic: these are nodes which, typically and in real life, have a better chance to match. From an algorithmic point of view, the difference between a node satisfying φ(q) and one satisfying ψ(q) is that once we jump from a node satisfying ψ but not φ, we have to glue back segments. We also define the following for every q ∈ Q:

  Ψ(q) := {q' | there is a path f : [1, m] → Q from q to q' with ¬ψ(f(j)) for all j ∈ [2, m−1]},
  l(q) := max_{q' ∈ Ψ(q)} dist(q, q').

I.e., in Example 1 above we would have Ψ(q_0) = {q_0, q_1}, Ψ(q_1) = {q_1, q_2, q_3, q_F}, Ψ(q_2) = {q_2, q_3, q_F}, Ψ(q_3) = {q_3, q_F}, Ψ(q_F) = {q_F}, and l(q_0) = 1, l(q_1) = 3, l(q_2) = 2, l(q_3) = 1, l(q_F) = 0. Note that, as M does not admit sinks, l(q) > 0 for any non-terminal state (accepting or not). Note also that Ψ and l are computable using a reverse BFS from q_F. The set Ψ(q) is simply the set of nodes of the automaton which are reachable from q by a path whose interior nodes do not satisfy ψ; these are the nodes to which we jump (i.e. start a new match of another path) if our current path ends with a node satisfying ψ(q), whereas l(q) is the size of the jump on the string.

Given a string x ∈ Σ*, and a finite disjoint union I = ⊔_{i=1}^{t} [a_i, b_i] of ordered intervals inside ℕ, we define P_I to be the set of maps f : I → Q such that

  0 ∈ I ⟹ f(0) = q_0,
  ∀i : f(a_{i+1}) ∈ Ψ(f(b_i)),
  ∀k ∈ [a_i, b_i − 1] : f(k+1) ∈ δ(f(k), x_k).

E.g. in the automaton from Example 1, given the string bad, and writing functions as sets of ordered pairs, we have

  P_{[0,2] ∪ [5,5]} = { {(0, q_0), (1, q_0), (2, q_1), (5, q_1)},
                        {(0, q_0), (1, q_0), (2, q_1), (5, q_2)},
                        {(0, q_0), (1, q_0), (2, q_1), (5, q_3)},
                        {(0, q_0), (1, q_0), (2, q_1), (5, q_F)} }.

Finally, we define S_I to be the set of sets of pairs of endpoints of the non-contiguous segments of the functions in P_I; i.e.
for the automaton in Example 1 and the same string as above we would have

  S_{[0,2] ∪ [5,5]} = { {(q_0, q_1)}, {(q_0, q_1), (q_2, q_2)}, {(q_0, q_1), (q_3, q_3)}, {(q_0, q_1), (q_F, q_F)} }.

3 The Segments Scan Algorithm

We are now ready to present our segments scan algorithm. The algorithm implicitly uses interval unions I of only two intervals: [0, p] ∪ [p + b, p + e]
(intuitively, p represents a point which Thompson's algorithm has surely arrived at, and the second segment is eventually supposed to glue to the first; we jump ahead in this fashion for the same reason we do so in Boyer-Moore's algorithm: if we fail, we want to do so while advancing as much as we can on the string). Instead of finding a compact representation of P_I we find a compact representation of S_I: the possible values of f(p) (intuitively: the points we jump from) will be represented by a union of two sets A, B, depending on whether the nodes satisfy φ or not. The pairs corresponding to the second interval of S_I will be represented by a set of pairs denoted S (continuing with our example from above, for S_{[0,2] ∪ [5,5]} we have S = {(q_1, q_1), (q_2, q_2), (q_3, q_3), (q_F, q_F)}). We denote π_2 S = {q | ∃q' : (q', q) ∈ S} (i.e. the projection on the second coordinate); we use the analogous notation π_1 for the projection on the first coordinate. We let O, O' be two boolean oracles (note that usually these oracles are simple functions depending on S, ψ, φ; see Example 2 below):

Algorithm 2: The Segments Scan Algorithm
  Input: x = (x_0, ..., x_{|x|-1}) ∈ Σ^{|x|}, M, preprocessed data φ, ψ, Ψ, l
  Output: True if M matches x; False otherwise
  1: p, b, e ← 0
  2: S ← {(q_0, q_0)}
  3: A ← ∅
  4: B ← ∅
  5: while p + e ≤ |x| do
  6:   if b = 0 then
  7:     if q_F ∈ π_2 S then return True                       // match!
  8:     if O or p = e = 0 then                                // jump
  9:       B ← π_2 S
 10:       A ← A ∪ {q ∈ B : φ(q)}
 11:       p ← p + e
 12:       b, e ← min_{q' ∈ A ∪ B} l(q')
 13:       S ← {(q', q') | ∃q ∈ A ∪ B : q' ∈ Ψ(q)}
 14:   if q_F ∉ π_2 S and e > b and O' then                    // crawl right
 15:     if p + e = |x| then return False                      // reached end of string
 16:     S ← {(q, q'') | ∃q' : (q, q') ∈ S, q'' ∈ δ(q', x_{p+e})}
 17:     e ← e + 1
 18:   else                                                    // crawl left
 19:     b ← b − 1
 20:     S ← {(q'', q) | ∃q' : (q', q) ∈ S, q' ∈ δ(q'', x_{p+b})}
 21:     if b = 0 then S ← {(q', q) ∈ S | q' ∈ A ∪ B}          // glue
 22:   if S = ∅ then return False                              // no active segments
 23:   if ∀(q', q) ∈ S : φ(q') then b ← 0
 24: end
 25: return False                                              // passed end of string
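Algorithm 2 takes the preprocessed data φ, ψ, Ψ, l as input. As a hypothetical illustration (our own encoding of δ as nested dictionaries, not the paper's code), this preprocessing can be sketched as follows; for simplicity the sketch runs a forward BFS from each state rather than the reverse BFS from q_F mentioned above, and approximates dist by the length of the shortest ψ-avoiding path, which agrees with the paper's values on Example 1:

```python
# Sketch of the preprocessing for Algorithm 2: the predicates phi and psi,
# the jump-target sets Psi(q), and the jump lengths l(q).
# NFA encoding (our assumption): delta[state][char] -> set of states.
from collections import deque

def reachable(delta, starts):
    # All states reachable from the set `starts` by one or more edge hops.
    seen, todo = set(starts), deque(starts)
    while todo:
        q = todo.popleft()
        for succs in delta.get(q, {}).values():
            for q2 in succs:
                if q2 not in seen:
                    seen.add(q2)
                    todo.append(q2)
    return seen

def preprocess(states, sigma, delta, q0, qF):
    succ = lambda q: {q2 for s in delta.get(q, {}).values() for q2 in s}
    # phi(q): q has a self-loop on every character, or q is accepting.
    phi = {q: q == qF or all(q in delta.get(q, {}).get(c, set()) for c in sigma)
           for q in states}
    # psi(q): q lies on a cycle, or q is the initial or accepting state.
    psi = {q: q in (q0, qF) or q in reachable(delta, succ(q)) for q in states}
    # Psi(q): states reachable by a path whose interior nodes all fail psi;
    # l(q): the largest BFS depth reached among those states.
    Psi, l = {}, {}
    for q in states:
        depth = {q: 0}
        todo = deque([q])
        while todo:
            u = todo.popleft()
            if u != q and psi[u]:
                continue  # psi-nodes may end a path but not be interior to one
            for v in succ(u):
                if v not in depth:
                    depth[v] = depth[u] + 1
                    todo.append(v)
        Psi[q], l[q] = set(depth), max(depth.values())
    return phi, psi, Psi, l
```

On Example 1's automaton this reproduces the values listed in the preliminaries, e.g. Ψ(q_1) = {q_1, q_2, q_3, q_F} and l(q_1) = 3.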
Example 2. In our experiments (see Section 5) we considered three pairs of oracles:

  O_1 : b = 0 or ∃q ∈ π_2 S : ψ(q);    O'_1 : ∃q ∈ π_2 S : ψ(q)
  O_2 : b = 0;                         O'_2 : ∃q ∈ π_2 S : ψ(q)
  O_3 : False;                         O'_3 : True

Theorem 1. The segments scan algorithm returns the same truth value as Thompson's algorithm.

In order to prove the theorem we first claim two lemmas:

Lemma 1. Immediately after line 6 in Thompson's algorithm, the set Z is the set of endpoints of the paths in P_{[0,p]}.

The proof of this lemma is quite straightforward; see [Th] for the details. We turn to prove the invariant our algorithm holds:

Lemma 2. Immediately after each of the lines 13, 17, 21, 23 in the segments scan algorithm:

  ∀f ∈ P_{[0,p] ∪ [p+b,p+e]} : either f(p) ∈ A ∪ B, or b = 0 and f(p) ∈ A ∪ B ∪ π_1 S; moreover, (f(p+b), f(p+e)) ∈ S;
  ∀(q, q') ∈ S : ∃f ∈ P_{[0,p] ∪ [p+b,p+e]} : f(p+b) = q, f(p+e) = q', f(p) ∈ A ∪ B.

Proof. The initialized values are I = {[0, 0]}, p, b, e = 0, A, B = ∅, and S = {(q_0, q_0)}; thus, as f(0) = q_0 for all f ∈ P_{[0,0]}, the two properties hold. We will show that if the properties hold at the beginning of the loop (i.e. at line 6) they hold at each of the other lines. We thus induct on the number of times we encounter each of these lines, and separate into cases:

After line 13: We get to this line either by not entering the if statement of line 6, which means all values are the same and the hypothesis holds, or we did enter it, and then b = 0. We now use the fact that

  [0, p] ∪ [p + b, p + e] = [0, p] ∪ [p + 0, p + e] = [0, p + e],  so  P_{[0,p] ∪ [p+0,p+e]} = P_{[0,p+e]}.

By the induction hypothesis, (f(p+b), f(p+e)) ∈ S, and thus for the new p' = p + e it holds that f(p') = f(p+e) ∈ π_2 S = B. Also, for the new values b', e', note that b' = e', and that by the assignment of line 13, (f(p'+b'), f(p'+e')) = (f(p'+b'), f(p'+b')) ∈ S. Moreover, the only pairs (q, q') added to S in these lines are (f(p'+b'), f(p'+b')), where obviously f(p') ∈ A ∪ B.
After line 17: By the induction hypothesis, the claim holds when the program was last before line 15; it holds after line 17 by the definition of δ.

After line 21: Here we separate into two cases: If b > 1, then by the induction hypothesis the claim holds when the program was last before line 18; it holds before line 21 by the definition of δ, and the then-part of the if in line 21 is never executed.
If b = 1 (note that we never get to this line with b = 0), then, by the definition of δ, the only way in which the induction hypothesis can be violated before line 21 is that the paths parametrized by S may not glue at p to the paths parametrized by A ∪ B; i.e., before line 21 there are pairs (q, q') ∈ S such that q ∉ A ∪ B, which means that the functions (from [p, p+e] to Q) parametrized by the pair (q, q') do not glue at p to any of the functions parametrized by A ∪ B. However, this issue is amended by line 21, where we get rid of the bad pairs (implicitly, by setting a unique value for f(p), the one which comes from the set A ∪ B). Thus the hypothesis holds after this line.

After line 23: By the induction hypothesis, the claim holds when the program was last before line 23; it holds after the line by the definition of φ: indeed, if (q', q) ∈ S and f : [p+b, p+e] → Q, then since ∀σ ∈ Σ : q' ∈ δ(q', σ), the function f can be trivially extended to [p+b−1, p+e] by setting f(p+b−1) to q'; we conclude this argument using a descending induction on b.

Proof (of Theorem 1). By Lemmas 1 and 2, if we get to line 7 in the segments scan algorithm, then, using parameter values from the segments scan algorithm, Thompson's algorithm has reached place p_Thompson = p + e on the string, with front Z = π_2 S. The first conclusion we draw from this fact is that if the segments scan algorithm exits successfully on line 7, then so does Thompson's algorithm. As for the other direction, assume that the segments scan algorithm exits unsuccessfully. We will analyze what happened between the last time the algorithm visited line 7 and the exit point, and prove that Thompson's algorithm exits unsuccessfully as well.
Let A, B, p be as they were set after the last visit to line 12, and, arguing by contradiction, assume that Thompson's algorithm exits successfully; let p_T-final be the number of iterations of Thompson's algorithm. Then by Lemma 2 there is a map f ∈ P_{[p, p_T-final]} such that f(p) ∈ A ∪ B and f(p_T-final) = q_F. By (descending) induction on a below, this means that for all a, a' such that p ≤ a ≤ a' ≤ p_T-final, the following two properties hold:

  P_{[0,p] ∪ [a,a']} ≠ ∅,  and  ∃f ∈ P_{[0,p] ∪ [a,p_T-final]} : f(p_T-final) = q_F.

By the first of these properties we do not pass the if in line 22 before hitting line 7 again, and by the second of these p + e cannot exceed p_T-final (see the condition in line 14) before hitting line 7 again, contradicting the assumption that we already reached this line for the last time.

3.1 Pruning of redundant extensions

There are four minor changes to the algorithm which may always be used to trim down the sizes of A, B and S. For the sake of simplicity of the exposition we omitted them from the initial presentation of the algorithm. The four changes we present are by and large independent of one another (we explicitly note when they are not).

1. Currently the set A represents all the nodes in the front which satisfy φ; instead, we can make A represent all the nodes q in the front which satisfy φ and admit a path from them to q_F which does not pass through other nodes of A (otherwise, why keep q? we can just keep these other nodes of A). I.e., immediately after line 10 we modify A as follows:

     A ← {q ∈ A | ∃ path f from q to q_F : range(f) ∩ A = {q}}.

   Note that the predicate above may be precomputed before we execute the algorithm: namely, for each node q satisfying φ we may encode the set of nodes Φ(q) such that we erase q from A only if A ∩ Φ(q) ≠ ∅.

2. In the case where b = 0, the set π_1 S is glued to A ∪ B. Thus we may work directly with π_2 S instead of S. Representing π_2 S by B, we simply have to make the following changes:
   Line 2: Substitute by S ← ∅.
   Line 4: Substitute by B ← {q_0}.
   Line 7: Substitute the π_2 S in the condition by B.
   Line 9: Erase.
   Line 16: Substitute by: if b = 0 then B ← {q'' | ∃q ∈ B : q'' ∈ δ(q, x_{p+e})} else S ← {(q, q'') | ∃q' : (q, q') ∈ S, q'' ∈ δ(q', x_{p+e})}.
   Line 21: Substitute the then-part by B ← {q | (q', q) ∈ S, q' ∈ A ∪ B}.

3. Ideally, instead of extending all the possible segments to the left, we would want to prune pairs (q', q) such that min_{q'' ∈ A ∪ B} dist(q'', q') > b. While this goal is difficult to achieve in general, it is easy enough to prune some of these pairs by modifying the update of S in the left crawl (line 20) to

     S ← {(q'', q) | ∃q' : (q', q) ∈ S, q' ∈ δ(q'', x_{p+b}) and (q'' ≠ q' or q'' ∈ A ∪ B)}.

   Note that a similar change should be made in the modification to line 21 in the previous paragraph.

4. Finally, allowing crawling to the right from states in A ∪ B bloats the size of S. Such a crawl is redundant, since we eventually crawl left to A ∪ B (either before performing another jump, or after it). Thus we can modify the update of S in line 13 to

     S ← {(q', q') | ∃q ∈ A ∪ B : q' ∈ Ψ(q), and (q' ≠ q or q ∈ A)},

   and the condition in line 22 to S = ∅ and A = ∅.
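The first change above can be sketched as a simple reachability filter: keep q ∈ A only if some path from q to q_F avoids every other node of A. This is a hypothetical sketch under our own dictionary encoding of δ, not the paper's code:

```python
# Sketch of pruning rule 1: filter A down to those nodes that can reach qF
# without passing through any other node of A.
# NFA encoding (our assumption): delta[state][char] -> set of states.
from collections import deque

def prune_A(A, delta, qF):
    def can_reach_qF_avoiding(q, blocked):
        # BFS from q in the subgraph that excludes the `blocked` nodes.
        seen, todo = {q}, deque([q])
        while todo:
            u = todo.popleft()
            if u == qF:
                return True
            for succs in delta.get(u, {}).values():
                for v in succs:
                    if v not in seen and v not in blocked:
                        seen.add(v)
                        todo.append(v)
        return False
    return {q for q in A if can_reach_qF_avoiding(q, A - {q})}
```

On Example 1's automaton with A = {q_0, q_1}, every path from q_0 to q_F passes through q_1, so q_0 is pruned and only q_1 is kept.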
3.2 Jumping, crawling, and oracles

In line 14 we use the oracle O' to decide whether crawling to the left is better than crawling to the right, whereas in line 8 we use the oracle O to determine when it is better to crawl to the right and when to jump (in Section 5 we show test results for the three oracle pairs presented in Example 2). All the example oracles we considered in Example 2 are motivated by Boyer-Moore; i.e., they have a bias to crawl to the left, which is only violated in cases which do not occur in the absence of loops in the automaton. We do not know if this design of the oracles is optimal (or even close to optimal) for typical regular expressions and for either typical or worst-case strings. While we are not sure how to analyze worst-case behaviour for typical regular expressions, we are working on an approach that we hope will prove useful both in the analysis and in the design of better oracles for typical strings, where "typical" here means generated by a Markov process (both for the string and the regular expression; for the latter the Markov process is a hierarchical one on the application of regular-expression grammatical rules). This approach is motivated by the run-time analysis of the Boyer-Moore algorithm for Markovian inputs in e.g. [B-YR], [S], [Ts].

3.3 Generalization: Segment unions of more than two segments

Our algorithm works on unions of segments I which are a union of only two segments, the first of which starts at 0. Hence, our algorithm either updates data about the segment which does not contain 0, or unites the two segments. However, we can modify the definition of Ψ(q) to

  Ψ_k(q) = {q' | there is a path f : [1, m] → Q from q to q' with #{j ∈ [2, m−1] : ψ(f(j))} ≤ k},

thus allowing the path to contain at most k interior nodes satisfying ψ (possibly, but not necessarily, adding other requirements, e.g. on distance, or on passing through some special nodes).
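Under the same dictionary encoding assumed earlier, the generalized jump set Ψ_k can be sketched as a BFS over (state, count) pairs, where the count tracks how many interior ψ-nodes the path has passed so far; this is our own illustration, not the paper's code:

```python
# Sketch of Psi_k(q): states reachable from q by a path whose interior
# contains at most k nodes satisfying psi.
# succ(q) returns the set of successor states; psi is a dict of booleans.
from collections import deque

def Psi_k(q, k, succ, psi):
    result = {q}
    best = set()  # (state, interior-psi count) pairs already enqueued
    todo = deque([(q, 0, True)])  # (state, count, is-the-path-start)
    while todo:
        u, c, is_start = todo.popleft()
        # Stepping past u makes u an interior node of the extended path
        # (the start of the path never counts as interior).
        c2 = c + (1 if (not is_start and psi[u]) else 0)
        if c2 > k:
            continue
        for v in succ(u):
            if (v, c2) not in best:
                best.add((v, c2))
                result.add(v)
                todo.append((v, c2, False))
    return result
```

With k = 0 this reduces to the original Ψ; on Example 1's automaton, Ψ_1(q_0) already reaches every state, since a single interior ψ-node (q_1) suffices to get past the .* loop.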
The algorithmic change would be to work with unions I which contain more than two segments; thus at each iteration of the main loop we would have to decide not between extending the left or right side of the second segment of S_I (which is currently represented by S), but between extending the left or right side of any of the segments except the first one. As we can store more states of the NFA at once, and if some state scans are more likely to fail than others, this may be an advantage. As usual, this can be determined based on the NFA, the text, or both.

4 Qualitative analysis of the number of character reads, character comparisons, and the front size

The number of character comparisons we perform in the segments scan algorithm is Σ_i #S_i, summing over the iterations i of the main loop (with some mild change if we use the second acceleration of 3.1, since in some iterations we have to add the size of B instead of that of S, which is smaller), whereas the number of character comparisons of Thompson's algorithm is Σ_p #Z_p, summing over the places p on the string.

In Section 6 we discuss various alternatives for substituting the left and right crawling operations on the entire set S by a constant-time operation. In this case, when comparing the segments scan algorithm to Thompson's, one simply has to compare the number of iterations of the segments scan algorithm, which is the number of character reads it performs, with the length of the string up to acceptance/denial by the automaton, which is the number of character reads Thompson's algorithm performs.

Example 3 (Bad regex and input string for the segments scan algorithm). We note that given the regular expression

  b(.{10}a|.{9}a.|.{8}a.{2}|.{7}a.{3}|.{6}a.{4}|.{5}a.{5}|.{4}a.{6}|.{3}a.{7}|.{2}a.{8}|.a.{9}|a.{10})

the input string aaaa..., and assuming the standard assumption above, our algorithm requires 10 times more character comparisons than Thompson's.

Why are we presenting this algorithm then? Simply put, the acceleration of the segments scan algorithm comes from performing big jumps (thus reducing the number of iterations of the main loop), while not increasing the size of S by so much as to compensate for this reduction. We cannot make any statements which are true for all input strings and all NFAs (in 3.2 we discussed future plans for a better worst-case analysis, as well as probabilistic quantitative statements, and how these statements would affect the algorithm). However, we do make two empirical claims which hold for regular expressions and input strings "in real life":

- Most not-very-short input words correspond only to paths on the NFA which stay on nodes satisfying φ(q).
- Most not-very-short input words which correspond to paths on the NFA which do not stay on nodes satisfying φ(q) correspond to a small number of such paths.
The effect of the first rule of thumb is that we may perform big jumps, and that therefore the number of iterations of the algorithm is small. The effect of the second rule of thumb is that after crawling only a small number of letters, the size of S is still small.

5 Experimental results

In our experiments, we implemented the algorithm and tested how many characters out of the input string x are actually read during the run of the segments scan algorithm, given automata for different regular expressions. We used the three oracle pairs from Example 2, which we denote in the tables below simply by 1, 2, 3, and used the first three optimizations presented in 3.1 (but not the fourth). We ran our searches on tests used by boost, on the Mark Twain corpus, measuring for each oracle pair the percentage of characters read, for the regular expressions:

  Twain
  Huck[[:alpha:]]
  [[:alpha:]]+ing
  Tom Sawyer
  Tom Sawyer|Huckleberry Finn
  (Tom Sawyer|Huckleberry Finn).{0,30}river
  river.{0,30}(Tom Sawyer|Huckleberry Finn)

and on test html search benchmarks, for the regular expressions:

  beman
  john
  dave
  <p>.*</p>
  <h[1-8][^>]*>.*</h[1-8]>
  <a[^>]+href=("[^"]*"|[^[:space:]]+)[^>]*>
  <img[^>]+src=("[^"]*"|[^[:space:]]+)[^>]*>
  <font[^>]+face=("[^"]*"|[^[:space:]]+)[^>]*>.*</font>

One can observe that the percentage of characters read in this test set, even for complicated and very non-word-match-like regular expressions (as in the last of the html example searches), was as low as 34%. Moreover, note that there is a significant difference depending on the oracle, and thus an oracle that can learn the input (regular expressions and common strings) might be much more efficient.

6 Accelerating the inner loops

There are three inner loops in our algorithm: one in the computation of π_2 S (which is rather standard to accelerate), the right expansion in line 16, and the left expansion in line 20, where we compute

  S ← {(q, q'') | ∃q' : (q, q') ∈ S, q'' ∈ δ(q', x_{p+e})},
  S ← {(q'', q) | ∃q' : (q', q) ∈ S, q' ∈ δ(q'', x_{p+b})},

respectively. Reducing these loops from O(|S|)-time operations to O(1)-time operations changes the performance of the algorithm from the number of character comparisons to the number of character reads (see Section 4). In this section we consider two acceleration methods for these loops: the more conservative method is constructing the DFA corresponding to the segments scan algorithm, whereas the more radical one is reliance on a hardware incarnation of the original NFA.

6.1 Full DFA and hybrid execution

Our algorithm is a complicated way to scan an automaton. Nevertheless, it is still an automaton scanning algorithm, and as such it admits an underlying DFA; i.e., we can construct the corresponding DFA (whose size, in a worst-case scenario, is O(size of the NFA squared)). E.g. the segments DFA corresponding to the NFA from Example 1 is given by a diagram (here we use the oracles O_1, O'_1, and all four optimizations presented in 3.1 when computing S) whose nodes record the values of S, b, e (and, where relevant, A and B); its start node carries S = {(q_0, q_0), (q_1, q_1)}, b = 1, e = 1, A = ∅, B = {q_0}. Legend: in outgoing edges from dashed nodes we consider the character x_{p+e}, whereas in full nodes we consider the character x_{p+b}; the first parameter on each edge is the accepting character of the edge, and the second parameter, if one exists, is the increment of p. Finally, note that we could represent the DFA partially and run the scan in a hybrid mode (see [BC]).

6.2 Accelerating left and right expansions, given an NFA with front expansion in O(1)

Assume we have at our disposal an NFA implementation such that front expansion is done in O(1) (an assumption which is not that far from reality; see e.g. [SP]). We may store S as a (possibly sparse) bit matrix C, and utilize the given NFA implementation to accelerate the left and right expansions: i.e., the right expansion C' of C with the character σ is given by

  C'[i, j] = ⋁_k [j ∈ δ(k, σ)] C[i, k].

7 Conclusion

We have presented a new algorithm to match strings to regular expressions. This algorithm evidently does not perform well in the worst case, but rather is suited to real-life regular expressions as we encounter them in the industry (e.g., in security filtering or IT monitoring scenarios, to name a couple). The algorithm is inspired by the Boyer-Moore algorithm in jumping over hopeless matches. By doing so, it actually holds a set of segments in the automaton that might later be completed to a full computation of the string by the automaton. We have shown that the algorithm is well suited to parallelization, and that it can be generalized in many different and promising ways.

References

AC. A. V. Aho, M. J. Corasick, Efficient string matching: An aid to bibliographic search. Communications of the ACM 18(6), June 1975.
B-YR. R. A. Baeza-Yates, M. Régnier, Average Running Time of the Boyer-Moore-Horspool Algorithm. Theor. Comput. Sci. 92(1), 1992.
BC. M. Becchi, P. Crowley, A hybrid finite automaton for practical deep packet inspection. CoNEXT 2007: 1-11.
BM. R. S. Boyer, J. S. Moore, A fast string searching algorithm. Comm. ACM 20(10), 1977.
C-W. B. Commentz-Walter, A String Matching Algorithm Fast on the Average. ICALP 1979 (extended abstract).
G. Z. Galil, On improving the worst case running time of the Boyer-Moore string matching algorithm. Comm. ACM 22(9), September 1979.
Ke. S. Kearns, Accelerated Finite Automata Enable Regular Expression Searching in Sublinear Time. Preprint, 2013.
KMP. D. Knuth, J. H. Morris, V. Pratt, Fast pattern matching in strings. SIAM Journal on Computing 6(2), 1977.
S. R. T. Smythe, The Boyer-Moore-Horspool heuristic with Markovian input. Random Structures and Algorithms, 2001.
SP. R. P. S. Sidhu, V. K. Prasanna, Fast Regular Expression Matching Using FPGAs. FCCM 2001.
Th. K. Thompson, Programming Techniques: Regular expression search algorithm. Communications of the ACM 11(6), 1968.
Ts. Tsung-Hsi Tsai, Average case analysis of the Boyer-Moore algorithm. Random Struct. Algorithms 28(4), 2006.
WW. B. W. Watson, R. E. Watson, A Boyer-Moore-style algorithm for regular expression pattern matching. Science of Computer Programming 48, 2003.
More information2. Exact String Matching
2. Exact String Matching Let T = T [0..n) be the text and P = P [0..m) the pattern. We say that P occurs in T at position j if T [j..j + m) = P. Example: P = aine occurs at position 6 in T = karjalainen.
More informationFinal exam study sheet for CS3719 Turing machines and decidability.
Final exam study sheet for CS3719 Turing machines and decidability. A Turing machine is a finite automaton with an infinite memory (tape). Formally, a Turing machine is a 6-tuple M = (Q, Σ, Γ, δ, q 0,
More informationAnalysis of Algorithms Prof. Karen Daniels
UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Spring, 2012 Tuesday, 4/24/2012 String Matching Algorithms Chapter 32* * Pseudocode uses 2 nd edition conventions 1 Chapter
More informationPushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen
Pushdown Automata Notes on Automata and Theory of Computation Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Pushdown Automata p. 1
More informationPattern Matching. a b a c a a b. a b a c a b. a b a c a b. Pattern Matching Goodrich, Tamassia
Pattern Matching a b a c a a b 1 4 3 2 Pattern Matching 1 Brute-Force Pattern Matching ( 11.2.1) The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift
More informationEnhancing Active Automata Learning by a User Log Based Metric
Master Thesis Computing Science Radboud University Enhancing Active Automata Learning by a User Log Based Metric Author Petra van den Bos First Supervisor prof. dr. Frits W. Vaandrager Second Supervisor
More information10. The GNFA method is used to show that
CSE 355 Midterm Examination 27 February 27 Last Name Sample ASU ID First Name(s) Ima Exam # Sample Regrading of Midterms If you believe that your grade has not been recorded correctly, return the entire
More informationNondeterminism. September 7, Nondeterminism
September 7, 204 Introduction is a useful concept that has a great impact on the theory of computation Introduction is a useful concept that has a great impact on the theory of computation So far in our
More informationCP405 Theory of Computation
CP405 Theory of Computation BB(3) q 0 q 1 q 2 0 q 1 1R q 2 0R q 2 1L 1 H1R q 1 1R q 0 1L Growing Fast BB(3) = 6 BB(4) = 13 BB(5) = 4098 BB(6) = 3.515 x 10 18267 (known) (known) (possible) (possible) Language:
More informationarxiv: v3 [cs.fl] 2 Jul 2018
COMPLEXITY OF PREIMAGE PROBLEMS FOR DETERMINISTIC FINITE AUTOMATA MIKHAIL V. BERLINKOV arxiv:1704.08233v3 [cs.fl] 2 Jul 2018 Institute of Natural Sciences and Mathematics, Ural Federal University, Ekaterinburg,
More informationUNIT-III REGULAR LANGUAGES
Syllabus R9 Regulation REGULAR EXPRESSIONS UNIT-III REGULAR LANGUAGES Regular expressions are useful for representing certain sets of strings in an algebraic fashion. In arithmetic we can use the operations
More informationDeterministic Finite Automata (DFAs)
CS/ECE 374: Algorithms & Models of Computation, Fall 28 Deterministic Finite Automata (DFAs) Lecture 3 September 4, 28 Chandra Chekuri (UIUC) CS/ECE 374 Fall 28 / 33 Part I DFA Introduction Chandra Chekuri
More informationTheory of Computation
Thomas Zeugmann Hokkaido University Laboratory for Algorithmics http://www-alg.ist.hokudai.ac.jp/ thomas/toc/ Lecture 10: CF, PDAs and Beyond Greibach Normal Form I We want to show that all context-free
More information1 More finite deterministic automata
CS 125 Section #6 Finite automata October 18, 2016 1 More finite deterministic automata Exercise. Consider the following game with two players: Repeatedly flip a coin. On heads, player 1 gets a point.
More informationComputational Models - Lecture 5 1
Computational Models - Lecture 5 1 Handout Mode Iftach Haitner and Yishay Mansour. Tel Aviv University. April 10/22, 2013 1 Based on frames by Benny Chor, Tel Aviv University, modifying frames by Maurice
More informationDeterministic Finite Automata
Deterministic Finite Automata COMP2600 Formal Methods for Software Engineering Ranald Clouston Australian National University Semester 2, 2013 COMP 2600 Deterministic Finite Automata 1 Pop quiz What is
More informationDeterministic Finite Automata (DFAs)
Algorithms & Models of Computation CS/ECE 374, Fall 27 Deterministic Finite Automata (DFAs) Lecture 3 Tuesday, September 5, 27 Sariel Har-Peled (UIUC) CS374 Fall 27 / 36 Part I DFA Introduction Sariel
More informationKleene Algebras and Algebraic Path Problems
Kleene Algebras and Algebraic Path Problems Davis Foote May 8, 015 1 Regular Languages 1.1 Deterministic Finite Automata A deterministic finite automaton (DFA) is a model of computation that can simulate
More informationTasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering
Tasks of lexer CISC 5920: Compiler Construction Chapter 2 Lexical Analysis Arthur G. Werschulz Fordham University Department of Computer and Information Sciences Copyright Arthur G. Werschulz, 2017. All
More informationConfusion of Memory. Lawrence S. Moss. Department of Mathematics Indiana University Bloomington, IN USA February 14, 2008
Confusion of Memory Lawrence S. Moss Department of Mathematics Indiana University Bloomington, IN 47405 USA February 14, 2008 Abstract It is a truism that for a machine to have a useful access to memory
More informationPS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010
University of Virginia - cs3102: Theory of Computation Spring 2010 PS2 - Comments Average: 77.4 (full credit for each question is 100 points) Distribution (of 54 submissions): 90, 12; 80 89, 11; 70-79,
More informationString Matching with Variable Length Gaps
String Matching with Variable Length Gaps Philip Bille, Inge Li Gørtz, Hjalte Wedel Vildhøj, and David Kofoed Wind Technical University of Denmark Abstract. We consider string matching with variable length
More informationTuring Machines, diagonalization, the halting problem, reducibility
Notes on Computer Theory Last updated: September, 015 Turing Machines, diagonalization, the halting problem, reducibility 1 Turing Machines A Turing machine is a state machine, similar to the ones we have
More informationWhat we have done so far
What we have done so far DFAs and regular languages NFAs and their equivalence to DFAs Regular expressions. Regular expressions capture exactly regular languages: Construct a NFA from a regular expression.
More informationSeptember 11, Second Part of Regular Expressions Equivalence with Finite Aut
Second Part of Regular Expressions Equivalence with Finite Automata September 11, 2013 Lemma 1.60 If a language is regular then it is specified by a regular expression Proof idea: For a given regular language
More informationFORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
5-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY NON-DETERMINISM and REGULAR OPERATIONS THURSDAY JAN 6 UNION THEOREM The union of two regular languages is also a regular language Regular Languages Are
More informationDecentralized Control of Discrete Event Systems with Bounded or Unbounded Delay Communication
Decentralized Control of Discrete Event Systems with Bounded or Unbounded Delay Communication Stavros Tripakis Abstract We introduce problems of decentralized control with communication, where we explicitly
More informationAutomata and Computability. Solutions to Exercises
Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata
More informationAutomata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,
Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu ETH Zürich (D-ITET) September, 24 2015 Last week was all about Deterministic Finite Automaton We saw three main
More information(Refer Slide Time: 0:21)
Theory of Computation Prof. Somenath Biswas Department of Computer Science and Engineering Indian Institute of Technology Kanpur Lecture 7 A generalisation of pumping lemma, Non-deterministic finite automata
More informationFinite Automata. Mahesh Viswanathan
Finite Automata Mahesh Viswanathan In this lecture, we will consider different models of finite state machines and study their relative power. These notes assume that the reader is familiar with DFAs,
More informationSpace-aware data flow analysis
Space-aware data flow analysis C. Bernardeschi, G. Lettieri, L. Martini, P. Masci Dip. di Ingegneria dell Informazione, Università di Pisa, Via Diotisalvi 2, 56126 Pisa, Italy {cinzia,g.lettieri,luca.martini,paolo.masci}@iet.unipi.it
More informationBefore we show how languages can be proven not regular, first, how would we show a language is regular?
CS35 Proving Languages not to be Regular Before we show how languages can be proven not regular, first, how would we show a language is regular? Although regular languages and automata are quite powerful
More informationRecognizing Safety and Liveness by Alpern and Schneider
Recognizing Safety and Liveness by Alpern and Schneider Calvin Deutschbein 17 Jan 2017 1 Intro 1.1 Safety What is safety? Bad things do not happen For example, consider the following safe program in C:
More informationIntroduction to Theory of Computing
CSCI 2670, Fall 2012 Introduction to Theory of Computing Department of Computer Science University of Georgia Athens, GA 30602 Instructor: Liming Cai www.cs.uga.edu/ cai 0 Lecture Note 3 Context-Free Languages
More informationFooling Sets and. Lecture 5
Fooling Sets and Introduction to Nondeterministic Finite Automata Lecture 5 Proving that a language is not regular Given a language, we saw how to prove it is regular (union, intersection, concatenation,
More informationCSC236 Week 11. Larry Zhang
CSC236 Week 11 Larry Zhang 1 Announcements Next week s lecture: Final exam review This week s tutorial: Exercises with DFAs PS9 will be out later this week s. 2 Recap Last week we learned about Deterministic
More informationCS 455/555: Finite automata
CS 455/555: Finite automata Stefan D. Bruda Winter 2019 AUTOMATA (FINITE OR NOT) Generally any automaton Has a finite-state control Scans the input one symbol at a time Takes an action based on the currently
More informationKnuth-Morris-Pratt Algorithm
Knuth-Morris-Pratt Algorithm Jayadev Misra June 5, 2017 The Knuth-Morris-Pratt string matching algorithm (KMP) locates all occurrences of a pattern string in a text string in linear time (in the combined
More information2. Elements of the Theory of Computation, Lewis and Papadimitrou,
Introduction Finite Automata DFA, regular languages Nondeterminism, NFA, subset construction Regular Epressions Synta, Semantics Relationship to regular languages Properties of regular languages Pumping
More informationLecture 3: Nondeterministic Finite Automata
Lecture 3: Nondeterministic Finite Automata September 5, 206 CS 00 Theory of Computation As a recap of last lecture, recall that a deterministic finite automaton (DFA) consists of (Q, Σ, δ, q 0, F ) where
More informationHKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed
HKN CS/ECE 374 Midterm 1 Review Nathan Bleier and Mahir Morshed For the most part, all about strings! String induction (to some extent) Regular languages Regular expressions (regexps) Deterministic finite
More informationEfficient Sequential Algorithms, Comp309
Efficient Sequential Algorithms, Comp309 University of Liverpool 2010 2011 Module Organiser, Igor Potapov Part 2: Pattern Matching References: T. H. Cormen, C. E. Leiserson, R. L. Rivest Introduction to
More informationComputational Theory
Computational Theory Finite Automata and Regular Languages Curtis Larsen Dixie State University Computing and Design Fall 2018 Adapted from notes by Russ Ross Adapted from notes by Harry Lewis Curtis Larsen
More informationCS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,
CS 54, Lecture 2: Finite Automata, Closure Properties Nondeterminism, Why so Many Models? Streaming Algorithms 0 42 Deterministic Finite Automata Anatomy of Deterministic Finite Automata transition: for
More informationUNIT-VIII COMPUTABILITY THEORY
CONTEXT SENSITIVE LANGUAGE UNIT-VIII COMPUTABILITY THEORY A Context Sensitive Grammar is a 4-tuple, G = (N, Σ P, S) where: N Set of non terminal symbols Σ Set of terminal symbols S Start symbol of the
More informationComputer Sciences Department
1 Reference Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER 3 objectives Finite automaton Infinite automaton Formal definition State diagram Regular and Non-regular
More informationJohns Hopkins Math Tournament Proof Round: Automata
Johns Hopkins Math Tournament 2018 Proof Round: Automata February 9, 2019 Problem Points Score 1 10 2 5 3 10 4 20 5 20 6 15 7 20 Total 100 Instructions The exam is worth 100 points; each part s point value
More informationAlgorithms for pattern involvement in permutations
Algorithms for pattern involvement in permutations M. H. Albert Department of Computer Science R. E. L. Aldred Department of Mathematics and Statistics M. D. Atkinson Department of Computer Science D.
More informationFall 1999 Formal Language Theory Dr. R. Boyer. 1. There are other methods of nding a regular expression equivalent to a nite automaton in
Fall 1999 Formal Language Theory Dr. R. Boyer Week Four: Regular Languages; Pumping Lemma 1. There are other methods of nding a regular expression equivalent to a nite automaton in addition to the ones
More informationDynamic Noninterference Analysis Using Context Sensitive Static Analyses. Gurvan Le Guernic July 14, 2007
Dynamic Noninterference Analysis Using Context Sensitive Static Analyses Gurvan Le Guernic July 14, 2007 1 Abstract This report proposes a dynamic noninterference analysis for sequential programs. This
More informationCSC173 Workshop: 13 Sept. Notes
CSC173 Workshop: 13 Sept. Notes Frank Ferraro Department of Computer Science University of Rochester September 14, 2010 1 Regular Languages and Equivalent Forms A language can be thought of a set L of
More informationMin/Max-Poly Weighting Schemes and the NL vs UL Problem
Min/Max-Poly Weighting Schemes and the NL vs UL Problem Anant Dhayal Jayalal Sarma Saurabh Sawlani May 3, 2016 Abstract For a graph G(V, E) ( V = n) and a vertex s V, a weighting scheme (w : E N) is called
More informationAutomata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,
Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu ETH Zürich (D-ITET) October, 5 2017 Part 3 out of 5 Last week, we learned about closure and equivalence of regular
More informationPart 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages
Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu Part 3 out of 5 ETH Zürich (D-ITET) October, 5 2017 Last week, we learned about closure and equivalence of regular
More informationAutomata and Computability. Solutions to Exercises
Automata and Computability Solutions to Exercises Fall 28 Alexis Maciel Department of Computer Science Clarkson University Copyright c 28 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata
More informationLecture 14 - P v.s. NP 1
CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) February 27, 2018 Lecture 14 - P v.s. NP 1 In this lecture we start Unit 3 on NP-hardness and approximation
More informationPart 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX
Automata & languages A primer on the Theory of Computation Laurent Vanbever www.vanbever.eu Part 4 out of 5 ETH Zürich (D-ITET) October, 12 2017 Last week, we showed the equivalence of DFA, NFA and REX
More informationIntroduction to Turing Machines. Reading: Chapters 8 & 9
Introduction to Turing Machines Reading: Chapters 8 & 9 1 Turing Machines (TM) Generalize the class of CFLs: Recursively Enumerable Languages Recursive Languages Context-Free Languages Regular Languages
More informationOn improving matchings in trees, via bounded-length augmentations 1
On improving matchings in trees, via bounded-length augmentations 1 Julien Bensmail a, Valentin Garnero a, Nicolas Nisse a a Université Côte d Azur, CNRS, Inria, I3S, France Abstract Due to a classical
More informationFinite Automata Part One
Finite Automata Part One Computability Theory What problems can we solve with a computer? What kind of computer? Computers are Messy http://en.wikipedia.org/wiki/file:eniac.jpg Computers are Messy That
More informationTheory of Computation Prof. Raghunath Tewari Department of Computer Science and Engineering Indian Institute of Technology, Kanpur
Theory of Computation Prof. Raghunath Tewari Department of Computer Science and Engineering Indian Institute of Technology, Kanpur Lecture 10 GNFA to RE Conversion Welcome to the 10th lecture of this course.
More informationCS 275 Automata and Formal Language Theory
CS 275 Automata and Formal Language Theory Course Notes Part II: The Recognition Problem (II) Chapter II.4.: Properties of Regular Languages (13) Anton Setzer (Based on a book draft by J. V. Tucker and
More informationCMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata
: Organization of Programming Languages Theory of Regular Expressions Finite Automata Previous Course Review {s s defined} means the set of string s such that s is chosen or defined as given s A means
More informationMinimization Techniques for Symbolic Automata
University of Connecticut OpenCommons@UConn Honors Scholar Theses Honors Scholar Program Spring 5-1-2018 Minimization Techniques for Symbolic Automata Jonathan Homburg jonhom1996@gmail.com Follow this
More informationCSci 311, Models of Computation Chapter 4 Properties of Regular Languages
CSci 311, Models of Computation Chapter 4 Properties of Regular Languages H. Conrad Cunningham 29 December 2015 Contents Introduction................................. 1 4.1 Closure Properties of Regular
More informationLecture 2: Connecting the Three Models
IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Advanced Course on Computational Complexity Lecture 2: Connecting the Three Models David Mix Barrington and Alexis Maciel July 18, 2000
More informationDiscrete Event Systems Exam
Computer Engineering and Networks Laboratory TEC, NSG, DISCO HS 2016 Prof. L. Thiele, Prof. L. Vanbever, Prof. R. Wattenhofer Discrete Event Systems Exam Friday, 3 rd February 2017, 14:00 16:00. Do not
More informationDeterministic Finite Automata (DFAs)
Algorithms & Models of Computation CS/ECE 374, Spring 29 Deterministic Finite Automata (DFAs) Lecture 3 Tuesday, January 22, 29 L A TEXed: December 27, 28 8:25 Chan, Har-Peled, Hassanieh (UIUC) CS374 Spring
More informationProbabilistic Model Checking Michaelmas Term Dr. Dave Parker. Department of Computer Science University of Oxford
Probabilistic Model Checking Michaelmas Term 2011 Dr. Dave Parker Department of Computer Science University of Oxford Probabilistic model checking System Probabilistic model e.g. Markov chain Result 0.5
More informationCompilers. Lexical analysis. Yannis Smaragdakis, U. Athens (original slides by Sam
Compilers Lecture 3 Lexical analysis Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Big picture Source code Front End IR Back End Machine code Errors Front end responsibilities Check
More informationLecture 3. 1 Terminology. 2 Non-Deterministic Space Complexity. Notes on Complexity Theory: Fall 2005 Last updated: September, 2005.
Notes on Complexity Theory: Fall 2005 Last updated: September, 2005 Jonathan Katz Lecture 3 1 Terminology For any complexity class C, we define the class coc as follows: coc def = { L L C }. One class
More informationProclaiming Dictators and Juntas or Testing Boolean Formulae
Proclaiming Dictators and Juntas or Testing Boolean Formulae Michal Parnas The Academic College of Tel-Aviv-Yaffo Tel-Aviv, ISRAEL michalp@mta.ac.il Dana Ron Department of EE Systems Tel-Aviv University
More information6.841/18.405J: Advanced Complexity Wednesday, February 12, Lecture Lecture 3
6.841/18.405J: Advanced Complexity Wednesday, February 12, 2003 Lecture Lecture 3 Instructor: Madhu Sudan Scribe: Bobby Kleinberg 1 The language MinDNF At the end of the last lecture, we introduced the
More informationLet us first give some intuitive idea about a state of a system and state transitions before describing finite automata.
Finite Automata Automata (singular: automation) are a particularly simple, but useful, model of computation. They were initially proposed as a simple model for the behavior of neurons. The concept of a
More informationAutomata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS
Automata Theory Lecture on Discussion Course of CS2 This Lecture is about Mathematical Models of Computation. Why Should I Care? - Ways of thinking. - Theory can drive practice. - Don t be an Instrumentalist.
More informationAutomata and Computability
Automata and Computability Fall 207 Alexis Maciel Department of Computer Science Clarkson University Copyright c 207 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata 5 2. Turing Machines...............................
More information