Finite Automata Approach to Computing All Seeds of Strings with the Smallest Hamming Distance


IAENG International Journal of Computer Science, 36:2, IJCS_36_2_05 (Advance online publication: 22 May 2009)

Ondřej Guth, Bořivoj Melichar

Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Computer Science and Engineering. Email: {gutho1,melichar}@fel.cvut.cz

Abstract

A seed is a type of regularity of strings. A restricted approximate seed w of a string T is a factor of T such that w covers a superstring of T under some distance rule. In this paper, the problem of all restricted seeds with the smallest Hamming distance is studied, and a polynomial time and space algorithm for solving the problem is presented. It searches for all restricted approximate seeds of a string with a given limited approximation using Hamming distance, and it computes the smallest distance for each found seed. The solution is based on a finite (suffix) automata approach that provides a straightforward way to design algorithms for many problems in stringology. Therefore, it is shown that the set of problems solvable using finite automata includes the one studied in this paper.

Keywords: approximate seed, suffix automaton, Hamming distance, stringology

1 Introduction

Searching for regularities of strings is used in a wide area of applications like molecular biology, computer-assisted music analysis, or data compression. By regularities, repeated strings are meant. Examples of regularities include repetitions, borders, periods, covers, and seeds.

The algorithm for computing all exact seeds of a string was introduced by Iliopoulos, Moore, and Park [1]. The first algorithm for searching all seeds using finite automata was introduced by Voráček and Melichar [2].

Finding exact regularities is not always sufficient, and thus some kind of approximation is used. An algorithm for searching approximate periods, covers, and seeds under Hamming, Levenshtein (also called edit), and weighted Levenshtein distance was presented by Christodoulakis, Iliopoulos, Park, and Sim [3]. The algorithm for computing approximate seeds was originally introduced by these authors in [4].

An algorithm for searching all covers under Hamming, Levenshtein, and Damerau distance using finite automata was introduced by Guth [5]; an optimized algorithm for computing all covers with the smallest Hamming distance was presented by Guth, Melichar, and Balík [6].

Finite automata provide a common formalism for many algorithms in the area of text processing (stringology), involving forward exact and approximate pattern matching and searching for borders, periods, and repetitions [7], backward pattern matching [8], pattern matching in a compressed text [9], the longest common subsequence [10], exact and approximate 2D pattern matching [11], and the already mentioned computing of approximate covers [5, 6] and exact covers [12] and seeds [2] in generalized strings. Therefore, we would like to further extend the set of problems solved using finite automata. Such a problem is studied in this paper.

A finite automaton as a data structure may be easily implemented. Therefore, using it as a base for a similar approach to many algorithms is not only a theoretical problem, as it may make development of software related to the above-mentioned areas easier, faster, and cheaper.

This paper is organized as follows: in Section 2, basic definitions and an overview of previous works are placed. In Section 3, the algorithm for the problem being studied is presented. Its theoretical time and space complexity is derived in Section 4 and experimental results are shown in Section 5.

2 Preliminaries

An alphabet is a nonempty finite set of symbols, denoted by $A$. A symbol of the alphabet is denoted by $a$. A string over an alphabet is a finite sequence of symbols of the alphabet. Having a string $T = a_1, a_2, \dots, a_{|T|}$, the reversed string is denoted by $T^R$ and it is equal to $T^R = a_{|T|}, a_{|T|-1}, \dots, a_1$. The empty string is an empty sequence of symbols, denoted by $\varepsilon$. The effective alphabet of a string T is the set of symbols that occur in T, denoted by $A_T$. Only the effective alphabet is considered in this paper. A language is a set of strings.

The set of all strings over alphabet A is denoted by $A^*$. The length of a string w is denoted by $|w|$, the i-th symbol of w is denoted by $w[i]$. The result of the concatenation of strings $x, y \in A^*$ is equal to $xy$; it may also be denoted by $x.y$. The operation superposition is defined in this way: for $x = pu$, $y = us$, the superposition of x and y is $pus$.

A distance is the minimum number of editing operations necessary to convert string x into string y. The maximum allowed distance is denoted by k. The Hamming distance between strings x and y, denoted by $D_H(x, y)$, is equal to the minimum number of editing operations replace (of one symbol) that are necessary to convert x into y. Only the Hamming distance is considered in this paper.

Suppose $p, p', s, s', u, w, x, T \in A^*$. p is a prefix of T if $T = pu$, s is a suffix of T if $T = us$, w is a factor (also called substring) of T if $T = uwx$ (T is then a superstring of w). The set of all factors of T is denoted by $Fact(T)$. $p'$ is an approximate prefix of T with maximum Hamming distance k if $T = pu$ and $D_H(p, p') \le k$. The set of all approximate prefixes of T with maximum Hamming distance k is denoted by $Pref_k(T)$. An approximate suffix and the set of all approximate suffixes of T with respect to k are defined by analogy; the set is denoted by $Suff_k(T)$.

We say that string w occurs in string T if $w \in Fact(T)$. Factor w occurs at position (end-position) i in string T if $\forall j \in \{1, \dots, |w|\} : w[j] = T[i - |w| + j]$. An end-set is the set of all i such that w occurs at position i in T. String w occurs approximately with maximum Hamming distance k at position i in string T (or w has an approximate occurrence at position i in T) if there exists a factor x of T that occurs at position i in T and $D_H(x, w) \le k$.

String w is a cover of T if T can be constructed by concatenations and superpositions of w. We also say that w covers T. String w is a seed of T if w covers some superstring of T. For example, […] and […] are some seeds of […].

String w is a restricted approximate cover of T with maximum Hamming distance k if w is a factor of T and there exist strings $s_1, s_2, \dots, s_r$; $s_i \in Fact(T)$ such that:

1. $D_H(w, s_i) \le k$ for all i where $1 \le i \le r$,
2. T can be constructed by superpositions and concatenations of copies of the strings $s_1, s_2, \dots, s_r$.

String w is a restricted approximate seed of string T with maximum Hamming distance k if w is a factor of T and w is a restricted approximate cover of some superstring of T with maximum Hamming distance k.
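The definitions of Hamming distance and of an approximate occurrence can be illustrated with a minimal Python sketch. The function names `hamming` and `approx_end_positions` are illustrative assumptions and do not come from the paper; positions are 1-based end positions, as in the definitions above.

```python
def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(1 for a, b in zip(x, y) if a != b)


def approx_end_positions(w, T, k):
    """Return [(i, d)] for every 1-based end position i at which w occurs
    in T with Hamming distance d <= k (d == 0 is an exact occurrence)."""
    m = len(w)
    result = []
    for i in range(m, len(T) + 1):
        d = hamming(T[i - m:i], w)   # factor of T ending at position i
        if d <= k:
            result.append((i, d))
    return result
```

For instance, `approx_end_positions("aba", "abaabb", 1)` gives `[(3, 0), (6, 1)]`: one exact occurrence ending at position 3 and one occurrence with a single replaced symbol ending at position 6.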
The smallest Hamming distance of a restricted approximate seed w of string T is the smallest possible integer $l_m$ such that w is a restricted approximate seed of T with maximum Hamming distance $l_m$.

A finite automaton M (also called finite state machine) is a quintuple $M = (Q, A, \delta, q_0, F)$, where Q is a nonempty finite set of states, A is an input alphabet, $\delta$ is a transition function, $q_0 \in Q$ is an initial state and $F \subseteq Q$ is a set of final states. A finite automaton is deterministic (abbreviated to DFA) if its transition function is $\delta : Q \times A \to Q$, i.e. for each pair of state $q_i$ and symbol $a$, there exists at most one state $q_j$ such that $\delta(q_i, a) = q_j$. A DFA is partial if there may exist a pair of state $q_i$ and symbol $a$ such that $\delta(q_i, a)$ is undefined. In this paper, partial DFA are considered in general. A deterministic trie is a DFA that may have its transition diagram represented as a tree, i.e. for each state $q_j$, there exists at most one state $q_i$ such that for any symbol $a \in A$, $\delta(q_i, a) = q_j$.

A finite automaton is nondeterministic (abbreviated to NFA) if its transition function is $\delta : Q \times A \to P(Q)$, i.e. for some pair of state $q_i$ and symbol $a$, there may exist more than one state $q_j$ such that $q_j \in \delta(q_i, a)$. A successor $q_j$ of state $q_i$ for symbol $a$ is a state from the result of the transition function, i.e. $q_j \in \delta(q_i, a)$ for an NFA and $q_j = \delta(q_i, a)$ for a DFA.

An extended transition function, denoted by $\delta^*$, is for a DFA defined for $a \in A$, $u \in A^*$ in this way:

$\delta^*(q, \varepsilon) = q$, $\delta^*(q, ua) = \delta(\delta^*(q, u), a)$

An extended transition function of an NFA is defined as:

$\delta^*(q, \varepsilon) = \{q\}$, $\delta^*(q, au) = \bigcup_{q_i \in \delta(q, a)} \delta^*(q_i, u)$

String w is accepted by a DFA when $\delta^*(q_0, w) = q$ for initial state $q_0$ and some final state q. String w is accepted by an NFA when $q \in \delta^*(q_0, w)$ for initial state $q_0$ and some final state q. We also say that the automaton accepts string w. Automaton $M_1$ is equivalent to automaton $M_2$ if $M_1$ and $M_2$ accept equal sets of strings.

The left language of a state q of a DFA is the set of all strings w for which $\delta^*(q_0, w) = q$ holds for initial state $q_0$. The left language of a state q of a trie contains one string, denoted by $factor(q)$.
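The extended transition function of an NFA and the acceptance condition can be sketched in a few lines of Python. The dictionary encoding of $\delta$ (keys are `(state, symbol)` pairs, values are sets of states) and the names `extended_delta` and `nfa_accepts` are assumptions made for this illustration only.

```python
def extended_delta(delta, q, w):
    """Extended transition function of an NFA, following the recursion
    delta*(q, eps) = {q}, delta*(q, a u) = union of delta*(q_i, u)
    over all q_i in delta(q, a)."""
    if w == "":
        return {q}
    a, u = w[0], w[1:]
    result = set()
    for q_i in delta.get((q, a), set()):
        result |= extended_delta(delta, q_i, u)
    return result


def nfa_accepts(delta, q0, finals, w):
    """w is accepted when some final state lies in delta*(q0, w)."""
    return any(q in finals for q in extended_delta(delta, q0, w))


# A small hypothetical NFA accepting strings over {a, b} that end in "ab".
example_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
```

With this example automaton, `nfa_accepts(example_delta, 0, {2}, "aab")` is true while `nfa_accepts(example_delta, 0, {2}, "aba")` is false. The recursion mirrors the definition directly; a practical implementation would track a set of current states instead, to avoid repeated work.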
A nondeterministic suffix automaton for string T and maximum Hamming distance k, denoted by $M_{SN}^k(T)$, is an NFA that accepts all strings from $Suff_k(T)$ (see Figure 3 for an example of such an automaton). Such an automaton $M_{SN}^k(T) = (Q, A_T, \delta, q_0, F)$ may be constructed in this way:

1. Create a layer of $|T| + 1$ states:
   (a) each state $q_i^0$ corresponds to position i in T (plus the initial state $q_0$, thus $0 < i \le |T|$),
   (b) for each state $q_i^0$ (except the last one, $q_{|T|}^0$) define the transition $\delta(q_i^0, T[i+1]) = q_{i+1}^0$,
   (c) define the last state $q_{|T|}^0$ as final (note that so far the automaton accepts exactly T).
2. Similarly, create a layer for each number of errors l, $1 \le l \le k$ (the only exception: no state $q_i^l$ is needed for $l > i$).
3. For each state $q_i^l$ (except the last state $q_{|T|}^l$ in each layer, and except the last layer) and for each symbol $a \in A_T$, $a \ne T[i+1]$ (i.e. not occurring in T at position $i+1$), define the transition $\delta(q_i^l, a) = q_{i+1}^{l+1}$.
4. Create long transitions from $q_0$: $\delta(q_0, a) = \{q_i^0 : a = T[i], 1 \le i \le |T|\} \cup \{q_i^1 : a \ne T[i], 1 \le i \le |T|\}$.
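The four construction steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: states are encoded as `(depth, level)` pairs, the transition function is a dictionary from `(state, symbol)` to a set of states, and the names `build_suffix_nfa` and `accepts` are hypothetical.

```python
def build_suffix_nfa(T, k):
    """Nondeterministic suffix automaton for T and maximum Hamming
    distance k. A state (i, l) has depth i (position in T, 1-based)
    and level l (number of errors); (0, 0) is the initial state."""
    n = len(T)
    alphabet = sorted(set(T))            # effective alphabet A_T
    q0 = (0, 0)
    delta = {}
    # Steps 1-3: layers 0..k with matching and mismatching transitions.
    for i in range(1, n):                # the last state of a layer has no successor
        for l in range(0, min(k, i) + 1):   # no state with level > depth
            for a in alphabet:
                if a == T[i]:            # Python T[i] is the symbol at position i + 1
                    delta.setdefault(((i, l), a), set()).add((i + 1, l))
                elif l < k:              # one more error, next layer
                    delta.setdefault(((i, l), a), set()).add((i + 1, l + 1))
    # Step 4: long transitions from the initial state to every position.
    for a in alphabet:
        targets = set()
        for i in range(1, n + 1):
            if T[i - 1] == a:
                targets.add((i, 0))
            elif k >= 1:
                targets.add((i, 1))
        delta[(q0, a)] = targets
    finals = {(n, l) for l in range(0, min(k, n) + 1)}
    return delta, q0, finals


def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of active states."""
    delta, q0, finals = nfa
    current = {q0}
    for a in w:
        current = set().union(*(delta.get((q, a), set()) for q in current))
    return bool(current & finals)
```

Under these assumptions, a nonempty string is accepted exactly when it differs from the suffix of T of the same length in at most k symbols. The initial state is not final in this sketch, so the empty string is not accepted.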

A level of a state of $M_{SN}^k(T)$ corresponds to the number of errors (the order l of the layer mentioned above); the depth of a state of this automaton is equal to the corresponding position in T (the number i mentioned above).

A deterministic suffix automaton for string T and maximum Hamming distance k, denoted by $M_S^k(T)$, is a DFA that accepts all strings from $Suff_k(T)$ (see Figure 5). The depth of a state q of the automaton is the length of the longest string w such that $\delta^*(q_0, w) = q$ for initial state $q_0$.

A deterministic automaton $M_S^k(T) = (Q, A_T, \delta, q_0, F)$ accepting the same language as the nondeterministic automaton $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$ may be created using the subset construction:

1. The set $Q = \{\{q_0^N\}\}$ is defined; the state $q_0 = \{q_0^N\}$ is treated as unmarked.
2. If each state in Q is marked, continue with step 4.
3. An unmarked state q is chosen from Q and the following operations are executed:
   (a) $\delta(q, a) = \bigcup \delta_N(r, a)$ for $r \in q$ and for all $a \in A_T$,
   (b) $Q = Q \cup \{\delta(q, a)\}$ for all $a \in A_T$,
   (c) state $q \in Q$ is marked,
   (d) continue with step 2.
4. $F = \{q : q \in Q, \exists r \in F_N, r \in q\}$.

Using the subset construction of DFA $M_S^k(T)$ equivalent to NFA $M_{SN}^k(T)$, every state $q \in Q$ corresponds to some subset of $Q_N$. This subset is called a d-subset (abbreviation of deterministic subset), denoted by $d(q)$. Each element of the d-subset corresponds to some state of $Q_N$. Where no confusion arises, the depth of the state corresponding to an element $r_j \in d(q)$ of d-subset $d(q)$ is simply denoted by $r_j$. Note that, considering only depths of elements, any d-subset $d(q)$ is equal to the end-set of all strings from the left language of q.

In the algorithms below, a d-subset is supposed to be implemented as a list, preserving the order of its elements. An element of the d-subset is denoted by $r_i$, where the subscript i means the index (order) of the element $r_i$ within the d-subset. In figures, states of nondeterministic automata and elements of d-subsets of deterministic automata are denoted by their depths and levels (e.g. depth 3 and level 2).
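The subset construction steps can be sketched in Python. This is a generic illustration under assumptions: the NFA is given as a dictionary from `(state, symbol)` to a set of states, DFA states are `frozenset` objects (the d-subsets), and the names `subset_construction` and `dfa_accepts` are hypothetical. Empty target sets are skipped, which yields a partial DFA.

```python
def subset_construction(delta_n, q0_n, finals_n, alphabet):
    """Build a partial DFA equivalent to the given NFA.
    Each DFA state is the frozenset of NFA states it stands for."""
    q0 = frozenset([q0_n])
    states = {q0}
    unmarked = [q0]                      # step 1: q0 is unmarked
    delta = {}
    while unmarked:                      # step 2: stop when all states are marked
        q = unmarked.pop()               # step 3: choose an unmarked state
        for a in alphabet:
            target = frozenset(s for r in q for s in delta_n.get((r, a), ()))
            if not target:
                continue                 # transition left undefined
            delta[(q, a)] = target
            if target not in states:
                states.add(target)
                unmarked.append(target)
    finals = {q for q in states if q & finals_n}   # step 4
    return states, delta, q0, finals


def dfa_accepts(dfa, w):
    """Follow the single path for w; reject when a transition is missing."""
    _, delta, q, finals = dfa
    for a in w:
        q = delta.get((q, a))
        if q is None:
            return False
    return q in finals


# Hypothetical NFA accepting strings over {a, b} that end in "ab".
example_nfa = ({(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}, 0, {2})
example_dfa = subset_construction(*example_nfa, alphabet="ab")
```

For the example NFA the construction produces three DFA states, `{0}`, `{0, 1}` and `{0, 2}`, of which only `{0, 2}` is final.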
Problem formulation (All restricted seeds with the smallest Hamming distance). Given a string T and maximum Hamming distance k, find all restricted approximate seeds of T with respect to k and compute their smallest distances.

The algorithm for searching (exact) seeds in generalized strings presented in [2] obviously works for (non-generalized) strings as well, because a string is a special case of a generalized string. It is based on the following idea. First, $M_{SN}(T)$ is constructed. An equivalent deterministic automaton $M_S(T) = (Q, A_T, \delta, q_0, F)$ is computed using the subset construction. One of the conditions for any factor to be a seed of string T is its length. A seed w must cover the central part of T (i.e. the part of T between the leftmost and the rightmost position of w within T), and it must cover the uncovered suffix of T and the uncovered prefix of T. All sufficiently long factors are then checked whether they cover the uncovered suffix and prefix of T, respectively. If $M_S(T)$ accepts some prefix of factor w, then w covers the uncovered suffix of T. If the suffix automaton $M_S(T^R)$ for the reversed string T accepts some prefix of the reversed factor w, then w covers the uncovered prefix of T. When w satisfies all the conditions, w is a seed of T.

Computation of the smallest Hamming distance of a cover (presented in [6]) is based on the following idea: when the maximum approximation of the first and the last position of cover w in T is $l_{min}$, then for its smallest distance $l_m$ it holds that $l_m \ge l_{min}$, because a cover is an approximate prefix and suffix of T and thus it cannot cover T without its first and last position. When cover w of T has positions with approximation at most l, then clearly $l_m \le l$. When the positions of w with the maximum approximation equal to l are no longer considered (the first and the last position must still be considered) and w is still a cover of T, then $l_m \le l - 1$. $l_m$ is decremented as long as w still covers T. This may be used with modifications for computation of seeds.
The algorithm for searching exact seeds from [2] uses two phases: first, the deterministic suffix automaton is constructed, and then the d-subsets are analyzed and seeds are computed. This means that the complete automaton, or at least all the d-subsets to be analyzed, need to be stored in memory at a time. By contrast, the algorithm for searching covers from [6] merges the phases: each d-subset is analyzed just after its construction. A depth-first-search-like algorithm is used and the states that are no longer needed are removed. For approximate seeds searching, there is also no need to store all elements of d-subsets of the automaton in memory at a time.

3 Problem solution

Some properties are common for exact and approximate seeds with Hamming distance. Hence the algorithm presented in [2] is used as a base of the algorithm for the problem studied in this paper, using some (but not all) techniques for searching covers in [6] for further improvements.

Every approximate restricted seed of string T is necessarily an exact factor of T with other possible approximate occurrences. The suffix automaton constructed for T and maximum Hamming distance k has extended transitions defined for all factors of T with respect to k. When $M_S^k(T)$ is constructed using the subset construction from $M_{SN}^k(T)$, each element of any d-subset of any state of $M_S^k(T)$ contains information not only about position (depth within $M_{SN}^k(T)$) but also about approximation (level within $M_{SN}^k(T)$). Therefore, it may be easily determined whether a string from the left language of any state of $M_S^k(T)$ is an exact factor of T.

Note 1. For string T, string v such that v is not a factor of T, and any string u being a superstring of v, it holds:

$\forall T, u, v \in A^*, v \in Fact(u) : v \notin Fact(T) \Rightarrow u \notin Fact(T)$

Lemma 1. For DFA $M_S^k(T) = (Q, A_T, \delta, q_0, F)$ created using the subset construction from $M_{SN}^k(T)$ and for its state $q_i \in Q$ with d-subset $d(q_i)$ such that $\forall r \in d(q_i) : level(r) > 0$, it holds that any successor of $q_i$ cannot contain an element $r_j$ such that $level(r_j) = 0$.

Proof. It follows from the following property of transitions of $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$: for all successors $r_j \in Q_N$ of $r_i \in Q_N$ it holds that $level(r_i) \le level(r_j)$.

Corollary. As only an exact factor of T may be a restricted seed of T, there is no need to construct any state of $M_S^k(T)$ having only non-zero-level elements in its d-subset, as such a state contains no exact factor of T in its left language. Therefore, when such a state is created during construction, it may be removed and none of its successors need to be constructed. Such a deterministic suffix automaton that contains only states having at least one zero-level element in its d-subset is denoted by $\tilde{M}_S^k(T)$.

Note 2. A special type of deterministic suffix automaton, the suffix trie, is considered in this paper. Construction of the trie and left language extraction is simpler than for a general suffix automaton. As the left language of any state of the trie contains exactly one string, extraction of the left language of any state takes linear time with respect to the length of the string (e.g. using an inverted transition function). See Figure 5 for an example of a suffix trie.
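The effect of the Corollary can be sketched in Python: a breadth-first construction of the trie that keeps a state only when its d-subset contains a zero-level element. This is a simplified illustration under assumptions, not the paper's Algorithm 1. It keeps every d-subset in memory instead of discarding elements, it derives successors directly from T rather than from a stored NFA, and the name `pruned_suffix_trie` is hypothetical. The long transitions from the initial state are simulated by starting at every depth $0, \dots, |T|-1$ with level 0.

```python
from collections import deque


def pruned_suffix_trie(T, k):
    """Map every exact factor of T to its d-subset, a list of
    (depth, level) pairs ordered by depth. States whose d-subset has
    no level-0 element are dropped together with all their successors."""
    n = len(T)
    alphabet = sorted(set(T))

    def successor(d, a):
        nxt = []
        for depth, level in d:
            if depth == n:               # end of T reached, no successor
                continue
            if T[depth] == a:            # matching symbol keeps the level
                nxt.append((depth + 1, level))
            elif level < k:              # mismatch costs one more error
                nxt.append((depth + 1, level + 1))
        return nxt

    start = [(i, 0) for i in range(n)]   # simulates the long transitions
    trie = {}
    queue = deque([("", start)])
    while queue:
        w, d = queue.popleft()
        for a in alphabet:
            nd = successor(d, a)
            if any(level == 0 for _, level in nd):   # exact factor of T
                trie[w + a] = nd
                queue.append((w + a, nd))
    return trie
```

For example, `pruned_suffix_trie("abaab", 1)["ab"]` is `[(2, 0), (4, 1), (5, 0)]`: exact occurrences ending at positions 2 and 5 and one occurrence with a single error ending at position 4. The elements come out ordered by depth without sorting, because each successor list is produced in the order of its predecessor.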
The relation between length and positions of any seed (presented in [2]) holds also for approximate positions with Hamming distance, as the distance is defined for strings of equal lengths only.

Note 3. When searching for covers with Hamming distance [6], it is possible to remove all states q of the deterministic suffix trie that do not represent a prefix, i.e. such q that $|factor(q)| < depth(r_1)$, where $d(q) = r_1, \dots, r_{|d(q)|}$. A similar property between the first position and the length of a factor is used for searching seeds: $|factor(q)| \ge \frac{depth(r_1)}{2}$. Unlike in computing covers, this condition cannot be used for removing states q and their successors.

Example 1. Let us consider the suffix trie for string T = […] and maximum Hamming distance k = 2. Factor […] cannot be a seed of T, as its first approximate position within T is 6. Factor […] is a seed of T with respect to k. It is obvious that for states $q_1, q_2$ of the trie, where $factor(q_1)$ = […] and $factor(q_2)$ = […], it holds: $q_2$ is a successor of a state that is a successor of $q_1$. Therefore, $q_1$ must not be removed in order to be able to find […].

Figure 1: Possible covering of string […] with string […] and Hamming distance 2 from Example 2

Figure 2: Possible covering of a superstring of […] with […] and Hamming distance 1 from Example 2

For computation of the smallest distance $l_m$ of each seed, the idea used for searching covers ([6]) may be used for searching seeds. Unlike searching covers, any position may be removed, including the first and the last, thus the only lower bound of $l_m$ is 0. Determining whether to continue decrementing l is more complex for seeds than for covers, as covering of the central part of T is not a sufficient condition for seeds. For a factor w of T, not only positions and their approximation need to be considered, but also the distance between the uncovered prefix (suffix) of T and some suffix (prefix) of w, respectively. See Algorithm 3 for further information.

Example 2. Let us have string T = […] and maximum Hamming distance k = 2. One seed of T with respect to k is […].
It may be a seed of T with Hamming distance 2, because its positions in T are 4, 5, 6, 7, and 8 with maximum approximation 2 (see Figures 1, 5). When position 8 with approximation 2 is removed, it is still a seed of T with positions 4, 5, 6, and 7, all with approximation at most 1 (see Figure 2).

The deterministic suffix trie is needed not only to determine positions of each factor w of T, but also for checking whether w is able to cover the uncovered prefix and suffix of T (see Algorithm 4). Thus, the trie must be able to accept strings of length at least $|w| - 1$. Therefore, the depth-first search with removing states from [6] cannot be used. By contrast, only elements of d-subset $d(q)$ may be removed after construction of all successors of q; transitions must be preserved. Thus, breadth-first search in the automaton is used (see Algorithm 1 and the usage of queues $L$, $L^R$). As no state of trie $M_S^k(T)$ is removed and the last element of each d-subset is preserved, it is possible to recognize all approximate suffixes of T of length at least $|w| - 1$ and their distance.

Like an exact seed, an approximate one must also cover the uncovered prefix and suffix of T (i.e. some prefix of seed w must be an approximate suffix of T and some suffix of w must be an approximate prefix of T). A technique similar to that for exact seeds ([2]) is used (Algorithm 4), but with tries $M_S^k(T)$ and $M_S^k(T^R)$. When some suffix of seed w of T (i.e. some prefix of reversed w) is accepted by $M_S^k(T^R)$, i.e. it is $factor(q)$ for some final state q of $\tilde{M}_S^k(T^R)$, w covers the uncovered prefix of T with approximation equal to the level of the last element $r_{|d(q)|}$ of $d(q)$. The same holds analogously for a prefix of w, the uncovered suffix of T and $M_S^k(T)$. For the complete solution of the problem see Algorithm 1.

Algorithm 2 Compute a state of deterministic suffix trie $M = (Q, A_T, \delta, q_0, F)$.
Input: NSA $(Q_N, A_T, \delta_N, q_0^N, F_N)$, state $q_t \in Q$, symbol $a \in A_T$, queue L of states.
Output: Modified M with possibly added successor $q_u$ of state $q_t$ for symbol $a$, modified queue L.
create new state $q_u$
define $depth(q_u) = depth(q_t) + 1$
for all $r_i \in d(q_t)$ (in order as stored in $d(q_t)$) do
    append all $r_j \in \delta_N(r_i, a)$ to $d(q_u)$ in ascending order by $depth(r_j)$
end for
if exists $r \in d(q_u)$ where $level(r) = 0$ then
    $Q \leftarrow Q \cup \{q_u\}$
    enqueue(L, $q_u$)
    if $r^u_{|d(q_u)|} \in F_N$, where $d(q_u) = r^u_1, \dots, r^u_{|d(q_u)|}$ then
        $F \leftarrow F \cup \{q_u\}$
    end if
end if

Algorithm 3 The smallest distance of a seed of T.
Input: d-subset $d(q) = r_1, r_2, \dots, r_{|d(q)|}$ representing seed w of T.
Output: The smallest distance $l_m$ of w.
$t \leftarrow d(q)$
$l_{max} \leftarrow \max_{r \in t} \{level(r)\}$
$l \leftarrow l_{max}$
repeat
    for all $r \in t : level(r) = l$ do
        remove r from t
    end for
    $l \leftarrow l - 1$
until w is not a seed of T using positions determined by t with respect to l (Algorithm 4)
$l_m \leftarrow l + 1$

Figure 3: Transition diagram of the nondeterministic approximate suffix automaton $M_{SN}^k(T)$ for string T = […] and maximum Hamming distance k = 2 from Example 3

Example 3. Let us compute the set of all seeds with maximum Hamming distance k = 2 for string T = […]. Nondeterministic suffix automata $M_{SN}^k(T)$ (see Figure 3) and $M_{SN}^k(T^R)$ (see Figure 4) are constructed. Next, the subset construction of deterministic suffix trie $M_S^k(T)$ from $M_{SN}^k(T)$ starts state by state (see the transition diagram of $M_S^k(T)$ with all states that need to be constructed in Figure 5); the same is done with trie $M_S^k(T^R)$ from $M_{SN}^k(T^R)$.

Some states may have only elements with non-zero level in their d-subsets (e.g. the state with elements of depths 7 and 8). Such states are removed and their successors are not constructed, as strings from their left languages (e.g. […]) are not factors of T (this follows from the Corollary of Lemma 1).

All other states need to be checked whether their left languages contain some seeds. For example, the state with d-subset […] contains string […] in its left language. The string occurs approximately at positions 6, 7 in T and exactly at position 8 in T, thus it cannot be a seed of T, as its leftmost occurrence within T ends at position 6 and its length is 3 (i.e. the occurrence starts at position 4), so any proper suffix of it cannot cover the uncovered prefix (positions 1 to 3) of T.

Another example is the state with d-subset […], which contains string […] in its left language. This string covers T with Hamming distance 2, and therefore it is a seed of T (see Figure 1). When all positions with the maximum distance (i.e. 8) are not considered, it is still a seed of T, as a proper prefix of it covers the uncovered suffix of T with Hamming distance 1 (see Figure 2).

See the resulting table of all seeds and their distances in Table 1.

4 Time and space complexities

Note 4.
As parts of $M_S^k(T)$ and $M_S^k(T^R)$ are constructed the same way in Algorithm 1, the time and space complexities of their construction are the same.

Lemma 2. Left languages of states of deterministic suffix trie $M_S^k(T) = (Q, A_T, \delta, q_0, F)$ are distinct, i.e.

$\forall q_1, q_2 \in Q ; q_1 \ne q_2 \Rightarrow factor(q_1) \ne factor(q_2)$

Algorithm 1 Compute the set of seeds of T with the smallest Hamming distances.
Input: String T, maximum Hamming distance k.
Output: Set $hseeds_k(T)$ of all seeds of T.
$hseeds_k(T) \leftarrow \emptyset$
construct $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$
construct $M_{SN}^k(T^R) = (Q_N^R, A_T, \delta_N^R, q_0^{NR}, F_N^R)$
create new state $q_0$ as the initial one of the deterministic suffix trie $\tilde{M}_S^k(T) = (Q, A_T, \delta, q_0, F)$
create new state $q_0^R$ as the initial one of the deterministic suffix trie $M_S^k(T^R) = (Q^R, A_T, \delta^R, q_0^R, F^R)$
define $factor(q_0) = \varepsilon$, $depth(q_0) = 0$, $depth(q_0^R) = 0$
create $L$, $L^R$ new empty queues of states
enqueue($L$, $q_0$), enqueue($L^R$, $q_0^R$)
while $L^R$ is not empty {construct complete $M_S^k(T^R)$ in this loop} do
    $q_t^R \leftarrow$ dequeue($L^R$)
    for all $a \in A_T$ do
        compute new state $q_u^R$ as successor of state $q_t^R$ for symbol $a$ using Algorithm 2
    end for
    discard all elements of $d(q_t^R)$ but the last one {all successors of $q_t^R$ have just been computed}
end while
while $L$ is not empty {construct $M_S^k(T)$ and compute seeds in this loop} do
    $q_t \leftarrow$ dequeue($L$)
    for all $a \in A_T$ do
        compute new state $q_u$ as successor of state $q_t$ for symbol $a$ using Algorithm 2
        if exists $r \in d(q_u)$ where $level(r) = 0$ {only state $q_u$ that is part of $\tilde{M}_S^k(T)$ is further processed} then
            define $w = factor(q_u) = factor(q_t)a$
            if w is a seed of T using positions determined by $d(q_u)$ (Algorithm 4) then
                compute the smallest distance $l_m$ of w (Algorithm 3)
                if $|w| > k$ or $l_m < |w|$ {all strings of length less or equal to $l_m$ are seeds} then
                    $hseeds_k(T) \leftarrow hseeds_k(T) \cup \{(w, l_m)\}$
                end if
            end if
        end if
    end for
    discard all elements of $d(q_t)$ but the last one
end while

Algorithm 4 Determine whether string w is a seed of T with maximum Hamming distance l.
Input: Already constructed parts of deterministic suffix automata $M_S^k(T) = (Q, A_T, \delta, q_0, F)$ and $M_S^k(T^R) = (Q^R, A_T, \delta^R, q_0^R, F^R)$, d-subset $t = r_1, r_2, \dots, r_{|t|}$ for $q \in Q$ and $w = factor(q)$, maximum Hamming distance l.
Output: Resolution whether w is a seed of T with respect to l and t.
if for all $i = 2, 3, \dots, |t|$ : $r_i - r_{i-1} \le |w|$
   and $\exists p \in Pref_0(w), |p| \ge |T| - r_{|t|}$ : $\delta^*(q_0, p) = q_1$, $q_1 \in F$, $level(r^{q_1}_{|d(q_1)|}) \le l$
   and $\exists s \in Suff_0(w), |s| \ge r_1 - |w|$ : $\delta^{R*}(q_0^R, s^R) = q_2$, $q_2 \in F^R$, $level(r^{q_2}_{|d(q_2)|}) \le l$ then
    return true
else
    return false
end if
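The three conditions of Algorithm 4, together with the quantity that Algorithm 3 computes, can be sketched in Python directly on strings. This is a simplified illustration under stated assumptions and not the authors' implementation: it compares strings instead of reading them through the tries, it finds the smallest distance by scanning l upward from 0 rather than by decrementing from the maximum level, and the names `is_seed` and `smallest_distance` are hypothetical.

```python
def hamming(x, y):
    """Number of differing positions of two equal-length strings."""
    return sum(1 for a, b in zip(x, y) if a != b)


def approx_end_positions(w, T, k):
    """[(end position, distance)] of occurrences of w in T within distance k."""
    m = len(w)
    ends = [(i, hamming(T[i - m:i], w)) for i in range(m, len(T) + 1)]
    return [(i, d) for i, d in ends if d <= k]


def is_seed(w, T, t, l):
    """Check the three conditions of Algorithm 4 for the occurrences
    t = [(end position, level)], sorted by end position."""
    if not t:
        return False
    n, m = len(T), len(w)
    ends = [pos for pos, _ in t]
    # 1. Central part: consecutive occurrences overlap or touch.
    if any(b - a > m for a, b in zip(ends, ends[1:])):
        return False
    # 2. Some prefix p of w, |p| >= |T| - r_last, is an approximate suffix of T.
    tail = n - ends[-1]
    suffix_ok = any(hamming(w[:j], T[n - j:]) <= l for j in range(tail, m + 1))
    # 3. Some suffix s of w, |s| >= r_1 - |w|, is an approximate prefix of T.
    head = ends[0] - m
    prefix_ok = any(hamming(w[m - j:], T[:j]) <= l for j in range(head, m + 1))
    return suffix_ok and prefix_ok


def smallest_distance(w, T, k):
    """Smallest l <= k for which w is a seed of T, or None if there is none."""
    d = approx_end_positions(w, T, k)
    for l in range(k + 1):
        t = [(pos, lv) for pos, lv in d if lv <= l]
        if is_seed(w, T, t, l):
            return l
    return None
```

For example, `smallest_distance("ab", "abaab", 1)` returns 1: the exact occurrences ending at positions 2 and 5 leave a gap, which the occurrence with one error ending at position 4 closes. The upward scan returns the first l for which the check succeeds; raising l only adds occurrences and relaxes the prefix and suffix conditions, so the check cannot start failing again for a larger l.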

Figure 5: Transition diagram of the constructed part of the suffix trie $M_S^k(T)$ for string T = […] and maximum Hamming distance k = 2 from Example 3; dashed states are removed as their left languages do not contain an exact factor of T and thus they are not states of $\tilde{M}_S^k(T)$

Table 1: All seeds of string T = […] with maximum Hamming distance k = 2 and their smallest distances $l_m$; p is the used prefix of the seed, s is the used suffix (both computed by Algorithm 4); see Example 3. Columns: seed, d-subset, $l_m$, occurrences, p, s.

Figure 4: Transition diagram of the nondeterministic approximate suffix automaton $M_{SN}^k(T^R)$ for the reversion of string T, i.e. $T^R$ = […], and maximum Hamming distance k = 2 from Example 3

Proof of Lemma 2, by contradiction. Let us have the following consideration: if there existed two states $q_1, q_2 \in Q$, $q_1 \ne q_2$ and $factor(q_1) = factor(q_2) = w$, it would mean the existence of two distinct sequences of transitions: $\delta^*(q_0, w) = q_1$ and $\delta^*(q_0, w) = q_2$. As Algorithm 1 creates a new state for every $a \in A_T$, the resulting automaton $M_S^k(T)$ is deterministic, so such distinct sequences of transitions for the same string are not possible; thus either $q_1 = q_2$ or $factor(q_1) \ne factor(q_2)$.

Definition 1. Let us consider string T and maximum Hamming distance k. When factor w approximately occurs e times in T with respect to k, we say that the number of repetitions of w in T with respect to k, denoted by $R_w^k(T)$, is $e - 1$. Then the number of repetitions of all factors of T with respect to k, denoted by $R^k(T)$, is defined as

$R^k(T) = \sum_{w \in Fact(T)} R_w^k(T)$

Lemma 3. The number of states of $\tilde{M}_S^k(T)$ is

$\frac{1}{2}(|T|^2 + |T|) - R^k(T) + 1$

Proof. The number of exact factors of T is $\frac{1}{2}(|T|^2 + |T|)$. As the left language of each state of $\tilde{M}_S^k(T)$ contains exactly one string, the number of states of $\tilde{M}_S^k(T)$ cannot be greater. By Lemma 2, a factor of T is contained in the left language of exactly one state independent of the number of its repetitions, therefore $R^k(T)$ is subtracted.

Note 5. As restricted approximate seeds of string T are exact factors of T, it is meaningful to consider the effective alphabet $A_T$ only, and $|A_T| \le |T|$ always holds (recall that the effective alphabet $A_T$ consists only of symbols that occur in T). It is also meaningless to consider high k, because every factor of T having length less or equal to k is always an approximate seed of T. Thus $k < |T|$ always holds. Usually k and $|A_T|$ are independent of $|T|$.

Lemma 4. The number of states of $M_S^k(T)$ constructed using Algorithm 1 is at most

$|A_T| \left( \frac{1}{2}(|T|^2 + |T|) - R^k(T) \right) + 1$

Proof. By Lemma 3, the number of states of $\tilde{M}_S^k(T) = (\tilde{Q}, A_T, \tilde{\delta}, \tilde{q}_0, \tilde{F})$ is $\frac{1}{2}(|T|^2 + |T|) - R^k(T) + 1$. But using Algorithm 1, more states are also constructed (but not stored) that have strings not being factors of T in their left languages. For every state $q \in \tilde{Q}$, a successor $q_j$ is constructed for each $a \in A_T$, but not every $q_j$ is in $\tilde{Q}$. The number of such successors varies from 0 to $|A_T|$ for each state of $\tilde{M}_S^k(T)$ (except the initial one, which has successors in $\tilde{Q}$ for all $a \in A_T$), thus at most $|A_T| \left( \frac{1}{2}(|T|^2 + |T|) - R^k(T) \right) + 1$ states could be constructed.

Lemma 5. For every d-subset of $\tilde{M}_S^k(T)$ constructed by Algorithm 1, it holds that there are no two elements having the same depth.

Proof.
It follows from properties of the transition function of $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$: for successors of the initial state $q_0^N$ it holds:

$\forall a \in A_T : \forall r_i, r_j \in \delta_N(q_0^N, a), i \ne j : depth(r_i) \ne depth(r_j)$

Therefore, d-subsets of successors of the initial state of $\tilde{M}_S^k(T) = (\tilde{Q}, A_T, \tilde{\delta}, \tilde{q}_0, \tilde{F})$ contain elements with distinct depths only. For any successors $r_j$ of all states $r_i$ of $M_{SN}^k(T)$ except the initial one it holds:

$\forall a \in A_T, \forall r_i : r_j \in \delta_N(r_i, a) \Rightarrow depth(r_j) = depth(r_i) + 1$

Let us use induction. Successors of the initial state of $M_S^k(T)$ have no elements with the same depth in their d-subsets. Let us consider any state $q_i \in \tilde{Q} \setminus \{\tilde{q}_0\}$ having no elements with the same depth in its d-subset. Any successor $q_j$ of such a state $q_i$ cannot have a d-subset with some elements of the same depth, as any element $r_s$ of $d(q_j)$ is constructed from an element $r_i \in d(q_i)$ in this way: $r_s \in \delta_N(r_i, a)$, $a \in A_T$, and thus $depth(r_s) = depth(r_i) + 1$. Therefore, the Lemma holds for all d-subsets of $\tilde{M}_S^k(T)$.

Lemma 6. The number of elements of all d-subsets of $M_S^k(T)$ is not greater than

$\frac{1}{2}(|T|^3 + |T|^2) - |T| R^k(T) + 1$

Proof. By Lemma 3, the number of states of $\tilde{M}_S^k(T)$ is at most $\frac{1}{2}(|T|^2 + |T|) - R^k(T) + 1$. As for the transition function of $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$ it holds:

$\forall a \in A_T : |\delta_N(q_0^N, a)| = |T|$

and

$\forall a \in A_T, \forall r \in Q_N \setminus \{q_0^N\} : |\delta_N(r, a)| \le 1$

and by Lemma 5, it is obvious that for all states q of $M_S^k(T)$ it holds that $|d(q)| \le |T|$, and moreover for the initial state $\tilde{q}_0$ of $\tilde{M}_S^k(T)$ it holds that $|d(\tilde{q}_0)| = 1$. Therefore, the number of elements of all d-subsets cannot be greater than $|T|$ times the number of states except the initial one.

Lemma 7. The number of elements of all d-subsets of $M_S^k(T)$ constructed using Algorithm 1 is not greater than

$|A_T| \left( \frac{1}{2}(|T|^3 + |T|^2) - |T| R^k(T) \right) + 1$

Proof. Clearly holds by Lemmas 4 and 6.

Lemma 8. Time complexity of the check whether d-subset $d(q)$ of $\tilde{M}_S^k(T)$ represents a seed $w = factor(q)$ of T (Algorithm 4) is at most

$2|d(q)| + 2|w| - 2$

that is $O(|T|)$.

Proof. The check whether w covers the central part of T (comparison of the depths of each two subsequent elements) takes $2|d(q)| - 2$, which is $O(|T|)$ by Lemma 5.
The check of existence of a prefix of w to cover the uncovered suffix of T takes $|w|$, that is $O(|T|)$, as it is found during reading w as input for the constructed part of $\tilde{M}_S^k(T)$. The check of existence of a suffix of w to cover the uncovered prefix of T also takes $|w|$, as it is found during reading w backwards as input for the constructed part of $\tilde{M}_S^k(T^R)$.

Lemma 9. Time complexity of the computation of the smallest distance of seed $w = factor(q)$ of T (Algorithm 3) is at most

$|d(q)| + k (3|d(q)| + 2|w| - 2)$

that is $O(k|T|)$.

Proof. Retrieving $l_{max}$ takes $|d(q)|$. Then there are at most $l \le k$ iterations in Algorithm 3. Each iteration means removal of some elements of the d-subset and a check whether w is still a seed of T (Lemma 8).

Note 6. The number of all seeds is $O(|T|^2)$ (like the number of factors). Thus, the sum of their lengths, denoted by $|hseeds_k(T)|$, is $O(|T|^3)$.

Theorem 1. Time complexity of computation of all seeds with their smallest distance for string T with maximum Hamming distance k (Algorithm 1) is

$O(k |A_T| |T|^3)$

Proof. Construction of the nondeterministic suffix automaton $M_{SN}^k(T) = (Q_N, A_T, \delta_N, q_0^N, F_N)$ for T and k takes $O(k |A_T| |T|)$. For each state of $\tilde{M}_S^k(T)$ (Lemma 3) and for each symbol of $A_T$, a new d-subset is constructed. As each element of any d-subset may be constructed in constant time (just using the already known $\delta_N$) and the elements are naturally ordered (no need to sort, as proven in [6]), all d-subsets are constructed in at most $|A_T| \left( \frac{1}{2}(|T|^3 + |T|^2) - |T| R^k(T) \right) + 1$ time. Each d-subset is checked whether it contains an element with zero level in linear time. The left language extraction of a state takes linear time, and by Lemmas 8 and 3 the theorem holds.

Lemma 10. During construction of $\tilde{M}_S^k(T)$ (Algorithm 1), there are at most $O(|T|^2)$ elements of d-subsets stored in memory at a time.

Proof. The number of factors of T of equal length z is at most $\min(|T| - z + 1, |A_T|^z)$. The number of approximate positions of such a factor is also at most $|T| - z + 1$. As Algorithm 1 uses breadth-first search for the construction, there are sometimes states with equal length z only. In such a case, there are $O(|T|^2)$ elements in L.
Otherwise, there are stored states with depths $z$ and $z + 1$, so the number of elements in $L$ stored at a time is $O(|T|^2) + O(|T|^2) = O(|T|^2)$.

Theorem 2. Space complexity of computation of all seeds is $O(|T|^2 + hseeds_k(T))$.

Proof. Space complexity of construction of $M^{SN}_{k}(T)$ is $O(k \cdot |A_T|)$ (proven in [6]). By Lemma 10, the number of elements stored in memory at a time is $O(|T|^2)$, as no more elements of d-subsets than those in $L$ plus $O(|T|)$ new ones are in memory at a time. By Lemma 3, the number of states of $M^{S}_{k}(T)$ is $O(|T|^2)$ (they are all stored in memory with one element each). As the constructed automaton is a trie, the number of transitions is also $O(|T|^2)$. The space complexity also depends on the size of the result, $hseeds_k(T)$.

5 Experimental results

The algorithm was implemented in C++ using STL and compiled using GNU C with the -O3 optimization level. The dataset used to test the algorithm is the nucleotide sequence of Saccharomyces cerevisiae chromosome IV (see footnote 1). The string $T$ consists of the first $|T|$ characters of the chromosome.

The first set of tests was run on an AMD Athlon (2200 MHz) system, with 2.5 GB of RAM, under the Gentoo Linux operating system (see Figures 6 and 7).

[Figure 6: Time consumption of the experimental run on the Athlon64 with respect to the text size (see Section 5). Axes: text length |T| (horizontal), time in seconds (vertical). Curves: k=78 (solid) and k=55 (dotted).]

[Figure 7: Time consumption of the experimental run on the Athlon64 with respect to the maximum distance (see Section 5). Axes: maximum distance k (horizontal), time in seconds (vertical). Curves: |T|=279 (solid) and |T|=159 (dotted).]

The second set of tests was run on an AMD Athlon (1400 MHz) system, with 1.2 GB of RAM, under the Gentoo Linux operating system (see Figure 8).

[Figure 8: Time consumption of the experimental run on the Athlon (1.4 GHz) with respect to the maximum distance (see Section 5). Axes: maximum distance k (horizontal), time in seconds (vertical). Curves: |T|=113 (solid) and |T|=149 (dotted).]

Note 7. In comparison to the experimental results presented in [3], the algorithm presented in this paper runs a bit faster for the same data, even on a slightly slower computer (1.3 seconds in [3] for text length 100 vs. a maximum of 0.7 second for text length 113, see Figure 8).

Footnote 1: The Saccharomyces cerevisiae chromosome IV dataset could be downloaded from

6 Conclusion

In this paper, we have shown that an algorithm design based on determinization of a suffix automaton is appropriate for computation of all restricted seeds with the smallest Hamming distance. The presented algorithm is straightforward, easy to understand and to implement, and its theoretical and experimental time requirements are comparable to the existing approach ([4]). For future work, we would like to extend the algorithm for searching seeds to other distances and to utilize a similar approach for searching other types of regularities.

Acknowledgments

This research was supported by the Czech Technical University in Prague as grant No. CTU and as grant No. CTU , by the Ministry of Education, Youth and Sports of the Czech Republic under research program MSM , and by the Czech Science Foundation as project No. 201/06/1039 and as project No. 201/09/0807.

References

[1] Iliopoulos, C. S., Moore, D., and Park, K. S., Covering a String, CPM 93: Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching, Springer-Verlag, London, UK, 1993, pp.

[2] Voráček, M. and Melichar, B., Computing Seeds in Generalized Strings, Proceedings of Workshop 2006, Czech Technical University in Prague, 2006, pp.

[3] Christodoulakis, M., Iliopoulos, C. S., Park, K. S., and Sim, J. S., Implementing Approximate Regularities, Mathematical and Computer Modelling, Vol. 42, October 2005, pp.

[4] Christodoulakis, M., Iliopoulos, C. S., Park, K. S., and Sim, J. S., Approximate Seeds of Strings, Journal of Automata, Languages and Combinatorics, Vol. 10, No. 5/6, 2005, pp.

[5] Guth, O., Searching Approximate Covers of Strings Using Finite Automata, POSTER 2008, Czech Technical University in Prague, Faculty of Electrical Engineering, Praha,

[6] Guth, O., Melichar, B., and Balík, M., Searching All Approximate Covers and Their Distance using Finite Automata, Information Technologies Applications and Theory, Univerzita P. J. Šafárika, Košice, 2008, pp.

[7] Melichar, B., Holub, J., and Polcar, T., Text Searching Algorithms, Volume I, November 2005, Available at

[8] Melichar, B., Text Searching Algorithms, Volume II, March 2006, Available at

[9] Lahoda, J., Melichar, B., and Žďárek, J., Pattern Matching in DCA Coded Text, Proceedings of the 13th International Conference on Implementation and Application of Automata, Springer, Heidelberg, 2008, pp.

[10] Melichar, B. and Polcar, T., The Longest Common Subsequence Problem - A Finite Automata Approach, Implementation and Application of Automata, Springer, New York, 2003, pp.

[11] Žďárek, J., Automata and 2D Pattern Matching, Advances on Two-dimensional Language Theory, University of Salerno, Salerno, 2006, p. 15.

[12] Voráček, M., Computing Covers in Generalized Strings, POSTER 2005, Czech Technical University in Prague, Faculty of Electrical Engineering,

(Advance online publication: 22 May 2009)
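To make the notion analysed in Lemmas 8 and 9 concrete, the following is a minimal brute-force sketch in C++ of the seed check and of the search for the smallest distance. It is not the suffix-automaton construction of Algorithm 1 or Algorithm 3; it simply tries every alignment of $w$ against $T$. The function names (`hammingDistance`, `isRestrictedApproxSeed`, `smallestSeedDistance`) are hypothetical, and the sketch assumes that a copy of $w$ overhanging either end of $T$ is compared with $T$ only on the overlapped part, under the same bound $k$.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hamming distance between t[tOff, tOff+len) and w[wOff, wOff+len).
static int hammingDistance(const std::string& t, long tOff,
                           const std::string& w, long wOff, long len) {
    int d = 0;
    for (long i = 0; i < len; ++i) {
        if (t[static_cast<std::size_t>(tOff + i)] !=
            w[static_cast<std::size_t>(wOff + i)]) {
            ++d;
        }
    }
    return d;
}

// Brute-force test (illustrative helper, not the paper's Algorithm 1):
// does w, a factor of t, cover some superstring of t when every aligned
// copy of w may differ from the overlapped part of t in at most k symbols?
bool isRestrictedApproxSeed(const std::string& t, const std::string& w, int k) {
    const long n = static_cast<long>(t.size());
    const long m = static_cast<long>(w.size());
    if (m == 0 || m > n) return false;
    // "Restricted": w itself must occur exactly in t.
    if (t.find(w) == std::string::npos) return false;

    long coveredUpTo = 0;  // t[0, coveredUpTo) is covered so far
    // start < 0 models a suffix of w covering a prefix of t;
    // start + m > n models a prefix of w covering a suffix of t.
    for (long start = -(m - 1); start < n; ++start) {
        const long lo = std::max(start, 0L);
        const long hi = std::min(start + m, n);
        // Later alignments start even further right, so a gap is final.
        if (lo > coveredUpTo) return false;
        if (hammingDistance(t, lo, w, lo - start, hi - lo) <= k) {
            coveredUpTo = std::max(coveredUpTo, hi);
        }
    }
    return coveredUpTo >= n;
}

// Smallest d in [0, k] for which w is a seed of t, or -1 if there is none.
int smallestSeedDistance(const std::string& t, const std::string& w, int k) {
    for (int d = 0; d <= k; ++d) {
        if (isRestrictedApproxSeed(t, w, d)) return d;
    }
    return -1;
}
```

For instance, `smallestSeedDistance("abcabd", "abc", 2)` yields 1 under these assumptions: the copy aligned at the fourth symbol differs from `abd` in one position, and no exact cover exists. Each check costs $O(|T| \cdot |w|)$ here, which is exactly the kind of repeated work the automaton-based approach avoids.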

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38 Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control

More information

Minimal DFA. minimal DFA for L starting from any other

Minimal DFA. minimal DFA for L starting from any other Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA

More information

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton 25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

Regular expressions, Finite Automata, transition graphs are all the same!!

Regular expressions, Finite Automata, transition graphs are all the same!! CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1

More information

Chapter 2 Finite Automata

Chapter 2 Finite Automata Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

1 Nondeterministic Finite Automata

1 Nondeterministic Finite Automata 1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you

More information

Convert the NFA into DFA

Convert the NFA into DFA Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:

More information

Formal Languages and Automata

Formal Languages and Automata Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages Deprtment of Computer Science, Austrlin Ntionl University COMP2600 Forml Methods for Softwre Engineering Semester 2, 206 Assignment Automt, Lnguges, nd Computility Smple Solutions Finite Stte Automt nd

More information

Lecture 08: Feb. 08, 2019

Lecture 08: Feb. 08, 2019 4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny

More information

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.) CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts

More information

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute Victor Admchik Dnny Sletor Gret Theoreticl Ides In Computer Science CS 5-25 Spring 2 Lecture 2 Mr 3, 2 Crnegie Mellon University Deterministic Finite Automt Finite Automt A mchine so simple tht you cn

More information

The size of subsequence automaton

The size of subsequence automaton Theoreticl Computer Science 4 (005) 79 84 www.elsevier.com/locte/tcs Note The size of susequence utomton Zdeněk Troníček,, Ayumi Shinohr,c Deprtment of Computer Science nd Engineering, FEE CTU in Prgue,

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016 CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

Homework 3 Solutions

Homework 3 Solutions CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Converting Regular Expressions to Discrete Finite Automata: A Tutorial

Converting Regular Expressions to Discrete Finite Automata: A Tutorial Converting Regulr Expressions to Discrete Finite Automt: A Tutoril Dvid Christinsen 2013-01-03 This is tutoril on how to convert regulr expressions to nondeterministic finite utomt (NFA) nd how to convert

More information

Java II Finite Automata I

Java II Finite Automata I Jv II Finite Automt I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz Finite Automt I p.1/13 Processing Regulr Expressions We lredy lerned out Jv s regulr expression

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

More on automata. Michael George. March 24 April 7, 2014

More on automata. Michael George. March 24 April 7, 2014 More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

1 From NFA to regular expression

1 From NFA to regular expression Note 1: How to convert DFA/NFA to regulr expression Version: 1.0 S/EE 374, Fll 2017 Septemer 11, 2017 In this note, we show tht ny DFA cn e converted into regulr expression. Our construction would work

More information

Worked out examples Finite Automata

Worked out examples Finite Automata Worked out exmples Finite Automt Exmple Design Finite Stte Automton which reds inry string nd ccepts only those tht end with. Since we re in the topic of Non Deterministic Finite Automt (NFA), we will

More information

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers 80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES 2.6 Finite Stte Automt With Output: Trnsducers So fr, we hve only considered utomt tht recognize lnguges, i.e., utomt tht do not produce ny output on ny input

More information

First Midterm Examination

First Midterm Examination Çnky University Deprtment of Computer Engineering 203-204 Fll Semester First Midterm Exmintion ) Design DFA for ll strings over the lphet Σ = {,, c} in which there is no, no nd no cc. 2) Wht lnguge does

More information

Deterministic Finite Automata

Deterministic Finite Automata Finite Automt Deterministic Finite Automt H. Geuvers nd J. Rot Institute for Computing nd Informtion Sciences Version: fll 2016 J. Rot Version: fll 2016 Tlen en Automten 1 / 21 Outline Finite Automt Finite

More information

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1 Non-Deterministic Finite Automt Fll 2018 Costs Busch - RPI 1 Nondeterministic Finite Automton (NFA) Alphbet ={} q q2 1 q 0 q 3 Fll 2018 Costs Busch - RPI 2 Nondeterministic Finite Automton (NFA) Alphbet

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-* Regulr Expressions (RE) Regulr Expressions (RE) Empty set F A RE denotes the empty set Opertion Nottion Lnguge UNIX Empty string A RE denotes the set {} Alterntion R +r L(r ) L(r ) r r Symol Alterntion

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont. NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018 CS 301 Lecture 04 Regulr Expressions Stephen Checkowy Jnury 29, 2018 1 / 35 Review from lst time NFA N = (Q, Σ, δ, q 0, F ) where δ Q Σ P (Q) mps stte nd n lphet symol (or ) to set of sttes We run n NFA

More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Name Ima Sample ASU ID

Name Ima Sample ASU ID Nme Im Smple ASU ID 2468024680 CSE 355 Test 1, Fll 2016 30 Septemer 2016, 8:35-9:25.m., LSA 191 Regrding of Midterms If you elieve tht your grde hs not een dded up correctly, return the entire pper to

More information

Formal Language and Automata Theory (CS21004)

Formal Language and Automata Theory (CS21004) Forml Lnguge nd Automt Forml Lnguge nd Automt Theory (CS21004) Khrgpur Khrgpur Khrgpur Forml Lnguge nd Automt Tle of Contents Forml Lnguge nd Automt Khrgpur 1 2 3 Khrgpur Forml Lnguge nd Automt Forml Lnguge

More information

NFAs continued, Closure Properties of Regular Languages

NFAs continued, Closure Properties of Regular Languages Algorithms & Models of Computtion CS/ECE 374, Fll 2017 NFAs continued, Closure Properties of Regulr Lnguges Lecture 5 Tuesdy, Septemer 12, 2017 Sriel Hr-Peled (UIUC) CS374 1 Fll 2017 1 / 31 Regulr Lnguges,

More information

Lexical Analysis Finite Automate

Lexical Analysis Finite Automate Lexicl Anlysis Finite Automte CMPSC 470 Lecture 04 Topics: Deterministic Finite Automt (DFA) Nondeterministic Finite Automt (NFA) Regulr Expression NFA DFA A. Finite Automt (FA) FA re grph, like trnsition

More information

Lecture 09: Myhill-Nerode Theorem

Lecture 09: Myhill-Nerode Theorem CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives

More information

Let's start with an example:

Let's start with an example: Finite Automt Let's strt with n exmple: Here you see leled circles tht re sttes, nd leled rrows tht re trnsitions. One of the sttes is mrked "strt". One of the sttes hs doule circle; this is terminl stte

More information

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck.

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck. Outline Automt Theory 101 Rlf Huuck Introduction Finite Automt Regulr Expressions ω-automt Session 1 2006 Rlf Huuck 1 Session 1 2006 Rlf Huuck 2 Acknowledgement Some slides re sed on Wolfgng Thoms excellent

More information

3 Regular expressions

3 Regular expressions 3 Regulr expressions Given n lphet Σ lnguge is set of words L Σ. So fr we were le to descrie lnguges either y using set theory (i.e. enumertion or comprehension) or y n utomton. In this section we shll

More information

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science CSCI 340: Computtionl Models Kleene s Theorem Chpter 7 Deprtment of Computer Science Unifiction In 1954, Kleene presented (nd proved) theorem which (in our version) sttes tht if lnguge cn e defined y ny

More information

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers Speech Recognition Lecture 2: Finite Automt nd Finite-Stte Trnsducers Eugene Weinstein Google, NYU Cournt Institute eugenew@cs.nyu.edu Slide Credit: Mehryr Mohri Preliminries Finite lphet, empty string.

More information

State Minimization for DFAs

State Minimization for DFAs Stte Minimiztion for DFAs Red K & S 2.7 Do Homework 10. Consider: Stte Minimiztion 4 5 Is this miniml mchine? Step (1): Get rid of unrechle sttes. Stte Minimiztion 6, Stte is unrechle. Step (2): Get rid

More information

Finite Automata-cont d

Finite Automata-cont d Automt Theory nd Forml Lnguges Professor Leslie Lnder Lecture # 6 Finite Automt-cont d The Pumping Lemm WEB SITE: http://ingwe.inghmton.edu/ ~lnder/cs573.html Septemer 18, 2000 Exmple 1 Consider L = {ww

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata CS103B ndout 18 Winter 2007 Ferury 28, 2007 Finite Automt Initil text y Mggie Johnson. Introduction Severl childrens gmes fit the following description: Pieces re set up on plying ord; dice re thrown or

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh Lnguges nd Automt Finite Automt Informtics 2A: Lecture 3 John Longley School of Informtics University of Edinburgh jrl@inf.ed.c.uk 22 September 2017 1 / 30 Lnguges nd Automt 1 Lnguges nd Automt Wht is

More information

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2016 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 9 1. (4pts) ((p q) (q r)) (p r), prove tutology using truth tles. p

More information

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51 Non Deterministic Automt Linz: Nondeterministic Finite Accepters, pge 51 1 Nondeterministic Finite Accepter (NFA) Alphbet ={} q 1 q2 q 0 q 3 2 Nondeterministic Finite Accepter (NFA) Alphbet ={} Two choices

More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

a,b a 1 a 2 a 3 a,b 1 a,b a,b 2 3 a,b a,b a 2 a,b CS Determinisitic Finite Automata 1

a,b a 1 a 2 a 3 a,b 1 a,b a,b 2 3 a,b a,b a 2 a,b CS Determinisitic Finite Automata 1 CS4 45- Determinisitic Finite Automt -: Genertors vs. Checkers Regulr expressions re one wy to specify forml lnguge String Genertor Genertes strings in the lnguge Deterministic Finite Automt (DFA) re nother

More information

Thoery of Automata CS402

Thoery of Automata CS402 Thoery of Automt C402 Theory of Automt Tle of contents: Lecture N0. 1... 4 ummry... 4 Wht does utomt men?... 4 Introduction to lnguges... 4 Alphets... 4 trings... 4 Defining Lnguges... 5 Lecture N0. 2...

More information

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun: CMPU 240 Lnguge Theory nd Computtion Spring 2019 NFAs nd Regulr Expressions Lst clss: Introduced nondeterministic finite utomt with -trnsitions Tody: Prove n NFA- is no more powerful thn n NFA Introduce

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30 Tlen en Automten Test 1, Mon 7 th Dec, 2015 15h45 17h30 This test consists of four exercises over 5 pges. Explin your pproch, nd write your nswer to ech exercise on seprte pge. You cn score mximum of 100

More information

1.3 Regular Expressions

1.3 Regular Expressions 56 1.3 Regulr xpressions These hve n importnt role in describing ptterns in serching for strings in mny pplictions (e.g. wk, grep, Perl,...) All regulr expressions of lphbet re 1.Ønd re regulr expressions,

More information

BACHELOR THESIS Star height

BACHELOR THESIS Star height BACHELOR THESIS Tomáš Svood Str height Deprtment of Alger Supervisor of the chelor thesis: Study progrmme: Study rnch: doc. Štěpán Holu, Ph.D. Mthemtics Mthemticl Methods of Informtion Security Prgue 217

More information

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings...

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings... Tle of contents: Lecture N0.... 3 ummry... 3 Wht does utomt men?... 3 Introduction to lnguges... 3 Alphets... 3 trings... 3 Defining Lnguges... 4 Lecture N0. 2... 7 ummry... 7 Kleene tr Closure... 7 Recursive

More information
