Succinct Text Indexes on Large Alphabet

Size: px
Start display at page:

Download "Succinct Text Indexes on Large Alphabet"

Transcription

1 Succinct Text Indexes on Lrge Alphet Meng Zhng 1, Jijun Tng 2, Dong Guo 1, Ling Hu 1, nd Qing Li 1 1 College of Computer Science nd Technology, Jilin University, Chngchun , Chin zm@mil.edu.cn, {guodg, li qing}@jlu.edu.cn, hul@mil.jlu.edu.cn, 2 Deprtment of Computer Science nd Engineering, University of South Crolin, USA jtng@cse.sc.edu Astrct. In this pper, we first consider some properties of strings who hve the sme suffix rry. Next, we design dt structure to support rnk nd select opertions on n lphet Σ using nlog Σ + o(nlog Σ ) its in O(log Σ ) time for text of length n. It lso supports n extended rnk, nmely rnk, such tht rnk α (T, i) returns the numer of letters which re not smller thn α in string T, plus the numer of αs up to position i. Also, it runs in O(log Σ ) time. By this structure, we implement the DAWG succinctly. The min structure only tkes nlog Σ + o(nlog Σ ) its nd supports sic opertions of DAWG efficiently. 1 Introduction Given text string, full-text indexes re dt structures tht cn e used to find ny sustring of the text quickly. Mny full-text indexes hve een proposed, such s suffix trees [8, 16], DAWGs [3] nd suffix rrys [7, 12]. However, the mjor drwck tht limits the pplicility of full-text indexes is their spce complexity the size of full-text indices re quite lrger thn the originl text. Stndrd representtions of suffix tree require 4nlogn its spce, where log denotes the logrithm se 2. Suffix rry ws proposed to reduce the spce cost of suffix trees. It consists of the vlues of the leves of the suffix tree in in-order, ut without the tree structure informtion, hence tkes only nlogn its. Recent reserches re focused on reducing the sizes of full-text indices [6, 9, 10, 15]. The compressed suffix rry structure [9] proposed y Grossi nd Vitter is the first method tht reduces the size of the suffix rry from O(nlogn) its to O(n) its nd supports ccess to ny entry of the originl suffix rry in O(log ɛ Σ n) time, for ny fixed constnt 0 < ɛ < 1 (without computing the entire originl suffix rry). FM-index proposed y Ferrgin nd Mnzini [6] is self-index dt structure Supported y NSF of Chin No nd Foundtion of Young Scientist of Jilin Province No Corresponding uthor.

2 2 Zhng Meng et l. with good compression rtio nd fst decompressing speed. The FM-index occupies t most 5nH k (T) + o(n) its of storge nd llows the serch for the occ occurrences of pttern P[1..p] within T in O(p + occlog 1+ε n) time. He et l. [10] present succinct representtion of suffix rrys of inry strings tht uses n + o(n) its. For the cse of lrge lphet, they suggested n pproch which conceptully sets it vector for ech lphet symol to support opertions rnk nd select, nd uses wvelet tree in the ctul implementtion to sve spce. They lso proved ctegoriztion theorem y which one cn determine whether given permuttion is the suffix rry for inry string. In this pper, we study the sme prolem over lrge lphet, nd develop spce-economicl method to solve the prolem. We first descrie some properties of strings whose suffix rry is given permuttion, such s t lest how mny different letters must occur in such strings. We then present dt structure tht supports rnk nd select opertions on lrge lphet using t most nlog Σ + o(nlog Σ ) its in O(log Σ ) time. Although these opertions cn e implemented to run in constnt time [6, 10], the dditionl spce occuption will e uncceptle if log Σ cn not e neglected. Thus, it is resonle to mke the opertions run in O(log Σ ) time to sve spce. The dt structure lso supports n extended rnk, nmely rnk, which lso runs in O(log Σ ) time without using ny dditionl spce. More precisely, rnk α (T, i) returns the numer of letters which re not smller thn α in string T, plus the numer of αs up to position i. Function rnk plys crucil role in succinct index for lrge lphet. In [6, 10], the sme function of rnk is performed vi tle of Σ logn its which for ech symol stores the numer of chrcters in the text tht lexicogrphiclly precede it. Bsed on this index, we implement the DAWG [3, 5] in succinct wy, not storing the sttes nd edges explicitly. The min structure only tkes nlog Σ + o(nlog Σ ) its for text of length n on n lphet Σ nd supports sic opertions of DAWG without loss of speed. 2 Bsic Definitions Let Σ e nonempty lphet nd Σ e the numer of symols in Σ. Let T = t 1 t 2...t n e word over Σ, T denotes its length, T[i] or t i its i th letter, nd T[i..] its suffix tht egins t position i, T[i..j] its sustring egins t i ends t j, 1 i j n. Let T R e the reverse string of T nd Suff(T) the set of ll suffixes of T nd Fct(T) the set of its fctors. Definition 1 For string T of length n over n ordered lphet Σ. Denote the set of different letters occur in T y A(T). A(T), function Order T () returns the numers of letters in A(T) tht is not greter thn. For termintion chrcter, Order T () = 0. T denotes the string of length n over integer lphet, such tht T [i] = Order T (T[i]). The rnk nd select opertion ply importnt roles in succinct dt structures. Function rnk 1 (B, i) nd rnk 0 (B, i) return the numer of 1s nd 0s in the it vector B[1..n] up to position i, respectively.

3 Succinct Text Indexes on Lrge Alphet 3 Lemm 1. [11]The rnk function cn e computed in constnt time y using dt structure of size n + o(n) its. Function select 1 (B, i) nd select 0 (B, i) return the positions of i th 1 nd 0, respectively. Lemm 2. [14]The select function cn e computed in constnt time y using dt structure of size n + o(n) its. For convenience, we use rnk (B), {0, 1}, to denote rnk (B, n). We will lso use rnk (B[s..i]), 1 s i n, to denote rnk 1 (B, i) rnk 1 (B, s 1). Becuse rnk runs in constnt time, rnk (B[s..i]) lso runs in constnt time. 3 Permuttions nd Suffix Arrys Permuttion P cn e treted s string over lphet {1, 2,..., n}. Becuse P[P 1 [1]] = 1 < P[P 1 [2]] = 2 <... < P[P 1 [n]] = n, where P 1 denotes the inverse permuttion of P, it is pprent tht the suffix rry of this string is P 1 nd vice vers. Among the strings who hve the sme suffix rry, the numer of different letters occur in ech string cn e different. The question is how to compute the miniml numer of different letters occur in such strings. The core of our solution is simple fct, tht is, for string T nd its suffix rry P, if T[P[i 1]] = T[P[i]], then T P[i 1]+1 < T P[i]+1, ecuse T P[i 1] < T P[i]. Therefore, if T P[i 1]+1 < T P[i]+1 is true then T[P[i]] cn e ny letter not less thn T[P[i 1]], including T[P[i 1]], such tht the suffix rry of T is still P; otherwise it must e greter thn T[P[i 1]]. According to this fct, we define the specil positions in P tht increse the numer of symols must occur in T. Definition 2 Given permuttion P of {1, 2,..., n}. For 1 i n, we cll i n incresing position of P, if i = 1 or P 1 [P[i 1] + 1] > P 1 [P[i] + 1]. Since is the miniml letter, therefore 1 is n incresing position of P. To chieve this, we ssume tht for ny permuttion P of {1, 2,..., n}, P[0] = n+1, P[n + 1] = n + 2. Thus for ny string T of length n, T[n + 2] should e greter thn ny chrcters in T. Denote the set of incresing positions of P y IP(P) nd the numer of incresing positions in P y ic(p). Let I e n incresing position of P, denote the minimum incresing position greter thn I y ν(i). Denote the mximl non-incresing position greter thn I such tht there is no incresing position etween I nd this position y κ(i). Definition 3 Given permuttion P of {1, 2,..., n}. Let T e string over n ordered lphet. For ny incresing position of P, sy I, if t P[I] t P[I+1]... t P[κ(I)] nd t P[κ(I)] < t P[ν(I)] if ν(i) exists, then T is clled generting string of P. Denote the set of generting string of P y G(P). The following theorem summrizes the property of strings who hve the sme suffix rry. The proof cn e found in [17].

4 4 Zhng Meng et l. Theorem 1. The suffix rry of T is P if nd only if T G(P). By theorem 1, the following is immedite. Theorem 2. Given permuttion P of {1, 2,..., n}. For ny string T whose suffix rry is P, the numer of different letters tht occur in T is t lest ic(p). By theorem 2, one cn determine t lest how mny different letters must occur in string whose suffix rry is given permuttion. The sme result ws first reveled y Bnni et l. [2]. To generte the strings whose suffix rrys re P. We cn set ech letter on position P[i] of T from P[1] to P[n]. First, the letter T[P[1]] must e the miniml letter of the lphet. Becuse P is treted s suffix rry, thus T P[i 1] < T P[i] nd T[P[i 1]] T[P[i]]. If i is not n incresing position then T[P[i 1]] nd T[P[i]] cn e set to the sme letter, otherwise T[P[i 1]] < T[P[i]]. Bnni et l. [2] presented n lgorithm to generte the string consisting ic(p) different letters whose suffix rry is P. The input of the lgorithm is P nd string w whose suffix rry is P. If w is not ville, P 1 cn tke this role. Then the lgorithm is the sme s ours. In [10], He et l. gve n lgorithm tht checks whether permuttion is the suffix rry for given inry string. According to the theorem 1, the check over lrge lphet cn e done y testing whether T G(P). Precisely, for ll i = 1,...,n 1, if there exits i, such tht T[P[i]] < T[P[i 1]] or T[P[i 1]] = T[P[i]] nd P 1 [P[i] + 1] < P 1 [P[i 1] + 1], then P is not the suffix rry for T. Otherwise, we hve T P[1] < T P[2] < < T P[n] ; therefore, P is the suffix rry for T. This simple lgorithm, of course, cn e used to check whether permuttion is the suffix rry of given inry string. 4 Succinct Indexes on Lrge Alphet For n internl node u of the suffix tree, where u is the longest string in the node, ll the occurrences of u re grouped consecutively in suffix rry of T, sy SA. Therefore, u cn e represented y n intervl [s, e] over SA where ll suffixes with prefix u re included nd SA[s] is the lexiclly smllest one, SA[e] is the lexiclly gretest one [1, 10]. In this pper, we use intervl to represent the sttes of DAWG nd give n implementtion of DAWG which is succinct nd fst. First, recll the definition of DAWG. For ny string u Σ, let u 1 S = {x ux S}. The syntctic congruence ssocited with Suff(w) is denoted y Suff(w) [3] nd is defined, for x, y, w Σ, y x Suff(w) y x 1 Suff(w) = y 1 Suff(w). We cll clsses of fctors the congruence clsses of the reltion Suff(w). Let [u] w denote the congruence clss of u Σ under Suff(w). The longest element in the equivlence clss [u] w is clled its representtive, denoted y rp([u] w ). Definition 4 The DAWG of w is directed cyclic grph with set of sttes {[u] w u Fct(w)} nd set of edges {([u] w,, [u] w ) u, u Fct(w), Σ}. Denoted y DAWG(w).The stte [ε] w is clled the root of DAWG(w).

5 Succinct Text Indexes on Lrge Alphet 5 The suffix link of stte p is the stte whose representtive v is the longest suffix of u such tht v not Suff(w) u. The suffix link is useful for mny string pplictions. In stte r of DAWG(T), ny string is suffix of rp(r). And if sustring in r ends t position i of T then other sustrings in r lso end t i. Therefore, the nodes nd suffix links of DAWG(T) form the suffix tree of reverse string of T [3]. Thus stte of DAWG(T), which corresponds to node in suffix tree of T R, cn e represented y n intervl of suffix rry of T R. Denote the suffix rry of T R y SA, for stte r, if r = [s, e], then for ny suffix of T R, sy Ti R, TSA R [s] T i R TSA R [e] if rp(r)r is prefix of Ti R where TSA R [s] is the lexiclly smllest suffix of T R of which rp(r) R is prefix, nd TSA R [e] is the gretest one of which rp(r) R is prefix. Fig. 1 shows such n exmple ' 3' () SA 0 8 [0,7] 1 [1,4] 7 3' [3,4] 2' [5,7] 2 [5,6] 7 [2,2] 3 [3,3] 6 [4,4] 4 [6,6] 5 [7,7] () 3 Fig.1. () The DAWG for. () The suffix tree for R. Ech node of the tree corresponds to DAWG stte mrked with sme numer. Ech edge corresponds to suffix link of DAWG. The intervl for ech node is shown on the right. The edges of DAWG lso need not stored explicitly. The stte trnsction, which occurs in the process of scnning n input pttern to find its occurrences in T y DAWG, cn e mpped to the chnging of intervl. For current stte s nd input letter, the stte trnsction is to find the stte which rp(s) is in, denoted y goto(s, ). For intervl [, e] nd input letter, we need to find the intervl corresponding to goto([, e], ). We define new succinct dt structure tht performs goto function. The dt structure extends the index in [10] to the cse of lrge lphet. Let T[0] = nd T[n + 1] =. Define n rry ˆT of size n + 1 s follows: ˆT[i + 1] = { T[1], if i = 0, T R [SA [i] 1] = T[n + 2 SA [i]], if 1 i n. Ech entry of this rry stores the chrcter fter ech prefix of T in the lexicl order of reverse strings of prefixes. For exmple, for T =, ˆT =

6 6 Zhng Meng et l.. ˆT is the result of Burrows-Wheeler trnsform(bwt) [4] on T R. The BWT result on T is first used in FM-index [6]. To del with the sitution of lrge lphet, we extend opertion rnk to lrge lphet. Opertion rnk α ( ˆT, i) returns the numer of αs in ˆT up to position i, where α Σ, Σ 2. We lso define nother useful opertion rnk. rnk α ( ˆT, i) returns the numer of letters which re not smller thn α in string T, plus the numer of αs up to position i. For it vectors, the rnk opertion cn e done in constnt time [11]. For rrys over lphet Σ, we develop method y which the rnk nd rnk opertion cn run in O(log Σ ) time, which will e descried in the next section. The following lgorithm determines whether pttern w occurs in given string T. It is similr to the reverse of BW count lgorithm of FM-index[6]. Scn(w) 1 s 1; e n for i 1 to w do 3 w[i] 4 s rnk ( ˆT, s 1) e rnk ( ˆT, e) 6 if s > e then 7 report w is not sustring of T 8 end if 9 end for 10 report w is sustring of T This function implements the DAWG existentil query. The DAWG stte [w[1..i]] T corresponding to intervl [s 1, e 1] is computed in ech step i of Scn. Chnging of intervl in ech step is corresponding to the stte chnging of DAWG existentil query. In the end, ll the suffixes of T R of which w R is the prefix re in [s 1, e 1] of SA. By this procedure, the numer of occurrences of pttern cn lso e computed, which is e s+1, the numer of suffixes in intervl [s 1, e 1]. The running time of these queries re O( w log Σ ), ecuse the running time of rnk is O(log Σ ). The speed is not slowed down compring to other implementtions [5] of DAWG. 5 Implementing Rnk nd Select on Lrge Alphet in O(log Σ ) Time with nlog Σ + o(nlog Σ ) Bits 5.1 The New Index Structure In this section, we use E to refer to ˆT nd denote log Σ y N. For it vector V, denote the i th it of V y V i, the it segment of V from i th it to j th it y V [i..j]. Our index consists series of it vectors of length n, E N, E N 1,..., E 1, computed from E. First, E N is defined s follows: E N i = E[i] N, 1 i n.

7 Succinct Text Indexes on Lrge Alphet 7 Bit vector E N 1 is defined s follows: E N 1 i = { E[select0(E N, i)] N 1, if 1 i rnk 0(E N ) nd rnk 0(E N ) 0 E[select 1(E N, i)] N 1, if rnk 0(E N ) < i n nd rnk 1(E N ) 0. Generlly speking, the it vector E k, 1 k < N, cn e constructed y the following procedures: First, order the positions in E y the most significnt N k its of the integers on them. For positions on which the integers hve the sme first N k its, keep their order in E. This procedure gives us series of positions: pos 1, pos 2,..., pos n. Second, set Ei k, 1 i n, to the kth significnt it of the integer on position pos i of E. Precisely, pos 1, pos 2,..., pos n re divided into 2 N k groups noted y G k 0,..., G k 2 N k 1 ; items of E on positions in Gk i hve the sme first N k its, which equl to the inry representtion of i. Accordingly, the vector E k is composed of 2 N k non-overlpping segments S0,. k.., S2 k N k 1. The its in Sk i re the k th most significnt its of items of E on positions in G i j ; the order of these its is in ccordnce with the order of positions in E. Denote the strt position of Si k in E k y F(Si k) nd the end position of Sk i in E k y L(Si k). EN hs only one segment S N = E N nd F(S N ) = 1, L(S N ) = n, where denotes the empty letter. Segments of E k, 1 k < N, is defined recursively s follows: For t from { 0 to 2 N k 1 S2t k E k [F(S k+1 t ).. F(S k+1 t ) + rnk = 0(S k+1 t ) 1], if rnk 0(S k+1 t ),, otherwise; { S2t+1 k E k [F(S k+1 t ) + rnk = 0(S k+1 t ).. L(S k+1 t )], if rnk 1(S k+1 t ), otherwise. Let E (k) denote the rry of length n, such tht E (k) [i] = E[i][N..k], 1 i n, where E[i] is treted s it vector. Then E k, 1 k < N, is defined recursively s follows: For t from 0 to 2 N k 1 Ei k = E[select 2t(E (k), i F(S k+1 t ) + 1)] k, if F(S k+1 t ) i F(S k+1 t ) + rnk 0(S k+1 nd S k+1 t, t ) E[select 2t+1(E (k), i F(S k+1 t ) rnk 0(S k+1 t ))] k, if F(S k+1 t ) + rnk 0(S k+1 t ) < k L(S k+1 nd S k+1 t ; Fig. 2 gives n exmple of this structure. Conceptully, the it vector E 0 is divided into Σ segments nd ll the elements re set to which denotes the empty letter. The numer of elements in segment S 0 is equl to rnk (E). In prctice, multi-key rnk nd select cn e computed without E 0. t ) 5.2 Rnk nd Select on Lrge Alphet We store the it vectors E N,E N 1,...,E 1 in continuous memory, nd tke them s one it vector, nmed E. Tht is E k = E[(k 1)n kn]. We uild rnk structures over E using the structure of [11]. The opertions on ny E k, sy

8 8 Zhng Meng et l. S 3 E i E S0 2 S1 2 E S0 1 S1 1 S2 1 S3 1 E S 0 0 S 0 1 S 0 2 S 0 3 S 0 4 S 0 5 S 0 6 S 0 7 E 0 Fig.2. Exmple of the it vectors for E = denotes the empty letter. rnk α (E k, i), cn e computed y rnk α (E, (k 1)n+i) rnk α (E, (k 1)n). E with corresponding rnk structure re our min indexing dt structure, which together use nlog Σ + o(nlog Σ ) its. The opertion select defined on lrge lphet cn lso e implemented using this dt structure. It is the reverse computing of rnk. select α (E, i) returns the index of i th α chrcter of E. The lgorithm of rnk nd select is given elow. To e cler, we do not use E in explining the lgorithm ut use seprted it vectors. rnk (E, end) 1 s 1; e n; c end 2 for k log Σ downto 1 do 3 c rnk k (E k [s..c]) 4 if k = 1 then 5 s s + rnk 0 (E k [s..e]) 6 else 7 e e rnk 1 (E k [s..e]) 8 end if 9 end for 10 return c select (E, count) 1 s N+1 1; e N+1 n; c count 2 for i log Σ downto 1 do 3 if i = 1 then 4 s i s i+1 +rnk 0 (E i [s i+1..e i+1 ]) 5 else 6 e i e i+1 rnk 1 (E i [s i+1..e i+1 ]) 7 end if 8 end for 9 for i 1 to log Σ do 10 c select i (E i [s i..c]) 11 end for 12 return c In lgorithm rnk (E, end), t the end of ech step k, the strt position nd end position (s nd e), of segment S k 1 N N 1 k re computed, where the numer of symols up to position end, whose first N k + 1 most significnt its re N k, sy c, is lso ville. In step k + 1, these vlues cn e computed from the vlues in step k. Therefore in the end, the numer of s in E cn e computed. Thus the rnk lgorithm is correct. By Lemm 1, ech itertion of function rnk runs in constnt time. Therefore, y this lgorithm, the numer of s in rry E of length n over n ritrry lphet Σ cn e clculted in O(log Σ ) time using nlog Σ + o(nlog Σ ) its.

9 Succinct Text Indexes on Lrge Alphet 9 Thus we cn find the intervl corresponding to goto(s, ) in O(log Σ ) time. The totl spce for succinct DAWG is nlog Σ + 4n + o(nlog Σ ) its. Becuse when rnk α (E, end) finished, the s 1 equls to the numer of letters which re not smller thn α in E. Therefore, y replcing line 10 of the lgorithm for rnk with the following sttement, lgorithm rnk is ville. return s + c 1. We next consider the correctness of select lgorithm. Denote the s, e nd c of ech itertion in the running of the for loop of rnk (E, end) y s i, e i nd c i, where i is the vlue of loop vrile nd c N+1 = end. The sequence of the computing of s i, e i nd c i in rnk is s follows: c N = rnk N (E N [s N..e N], c N+1) c N 1 = rnk N 1 (E N 1 [s N 1..e N 1], c N). c 1 = rnk 1 (E 1 [s 1..e 1], c 2) Denote the vlue of c in ech for loop in lines 9-11 of select y c k (k is the loop vrile) nd denote the initil vlue of c y c 1 = count, then select (E, count)=c N+1. Since the first for loop of function select (E, i) is to compute rnk (E), then s i = s i nd e i = e i. According to the select lgorithm, the sequence of computing of c i is s follows (we replce s i, e i with s i, e i ): c 2 = select 1 (E 1 [s 1..e 1], c 1) c 3 = select 2 (E 2 [s 2..e 2], c 2). c N+1 = select N (E N [s N..e N], c N) Immeditely, we get c N = c N nd c i = c i for 1 i N. If E[end] =, c N+1 = c N+1. Therefore select (E, count) = end nd the select lgorithm is correct. 6 Conclusions We study the properties of strings whose suffix rry is given permuttion, nd lso give succinct index structure for lrge lphet. The core is dt structure tht supports rnk nd select on lrge lphet using nlog Σ + o(nlog Σ ) its in O(log Σ ) time. There re still mny interesting issues, such s idirectionl index sed on this structure, the reltion etween strings with inverse suffix rrys. There re works to e done to revel these structures. References 1. M. I. Aouelhod, E. Ohleusch, nd S. Kurtz. Optiml exct string mtching sed on suffix rrys. In Proc. 9th Interntionl Symposium on String Processing nd Informtion Retrievl (SPIRE02), LNCS 2476, pges Springer-Verlg, 2002.

10 10 Zhng Meng et l. 2. Hideo Bnni, Shunsuke Ineng, Ayumi Shinohr, nd Msyuki Tked. Inferring strings from grphs nd rrys. In Proc. 28th Interntionl Symposium on Mthemticl Foundtions of Computer Science (MFCS 2003), LNCS 2747, pges Springer Verlg, A.Blumer, J.Blumer, D.Hussler, A.Ehrenfeucht, M.T.Chen, nd J.Seifers. The smllest utomtion recognizing the suwords of text. Theoreticl Computer Science, 40:31-55, M. Burrows nd D. J. Wheeler. A lock-sorting lossless dt compression lgorithm. DEC SRC Reserch Report 124, M. Crochemore nd C. Hncrt. Automt for mtching ptterns. In G. Rozenerg nd A. Slom, editors, Hndook of Forml Lnguges, volume 2, Liner Modeling: Bckground nd Appliction, chpter 9, pges Springer-Verlg, P. Ferrgin nd G. Mnzini. Opportunistic dt structures with pplictions. In Proceedings of the 4lst Annul IEEE Symposium on Foundtions of Computer Science (FOCS), pges , G. Gonnet, R. Bez-Ytes, T. Snider, New indices for text: PAT trees nd PAT rrys, in: W. Frkes, R.A. Bez-Ytes (Eds.), Informtion Retrievl: Algorithms nd Dt Structures, Prentice-Hll, Englewood Cliffs, NJ, 1992, pp D. Gusfield. Algorithms on Strings Trees nd Sequences. Cmridge University- Press, New York, New York, R. Grossi nd J. Vitter. Compressed suffix rrys nd suffix trees with pplictions to text indexing nd string mtching. In Proceedings of the 32nd ACM Symposium on Theory of Computing (STOC), Meng He, J. In Munro, S. Srinivs Ro. A ctegoriztion theorem on suffix rrys with pplictions to spce efficient text indexes In SIAM Symposium on Discrete Algorithms (SODA), Pges: G. Jcoson. Succinct sttic dt structures. Technicl Report CMU-CS , Dept. of Computer Science, Crnegie-Mellon University, Jn U. Mner nd G. Myers. Suffix rrys: A new method for on-line string serches. SIAM Journl on Computing, 22: , Oct J.I. Munro nd V. Rmn, Succinct Representtion of Blnced Prentheses, Sttic Trees nd Plnr Grphs, Proc. 38th Annul IEEE Symp. on Foundtions of Computer Science, Octoer 1997, pges J. I. Munro. Tles. In Proceedings of the 16th ry Conference on Foundtions of Softwre Technology nd Computer Science (FSTTCS 96), LNCS 1180, pges 37 42, K. Sdkne. Compressed text dtses with efficient query lgorithms sed on the compressed suffix rrys. In Proc. 11th Interntionl Symposium on Algorithms nd Computtion, pges Springer-Verlg LNCS 1969, P. Weiner. Liner pttern mtching lgorithm. In Proc. 14th Annul IEEE Symposium on Switching nd Automt Theory, pges 1-11, Meng Zhng. Succinct Text Indexes on Lrge Alphet. Technicl Report, Jilin University, 2005.

Dynamic Fully-Compressed Suffix Trees

Dynamic Fully-Compressed Suffix Trees Motivtion Dynmic FCST s Conclusions Dynmic Fully-Compressed Suffix Trees Luís M. S. Russo Gonzlo Nvrro Arlindo L. Oliveir INESC-ID/IST {lsr,ml}@lgos.inesc-id.pt Dept. of Computer Science, University of

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

The size of subsequence automaton

The size of subsequence automaton Theoreticl Computer Science 4 (005) 79 84 www.elsevier.com/locte/tcs Note The size of susequence utomton Zdeněk Troníček,, Ayumi Shinohr,c Deprtment of Computer Science nd Engineering, FEE CTU in Prgue,

More information

Minimal DFA. minimal DFA for L starting from any other

Minimal DFA. minimal DFA for L starting from any other Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA

More information

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages Deprtment of Computer Science, Austrlin Ntionl University COMP2600 Forml Methods for Softwre Engineering Semester 2, 206 Assignment Automt, Lnguges, nd Computility Smple Solutions Finite Stte Automt nd

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

On Suffix Tree Breadth

On Suffix Tree Breadth On Suffix Tree Bredth Golnz Bdkoeh 1,, Juh Kärkkäinen 2, Simon J. Puglisi 2,, nd Bell Zhukov 2, 1 Deprtment of Computer Science University of Wrwick Conventry, United Kingdom g.dkoeh@wrwick.c.uk 2 Helsinki

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)

More information

Parse trees, ambiguity, and Chomsky normal form

Parse trees, ambiguity, and Chomsky normal form Prse trees, miguity, nd Chomsky norml form In this lecture we will discuss few importnt notions connected with contextfree grmmrs, including prse trees, miguity, nd specil form for context-free grmmrs

More information

Preview 11/1/2017. Greedy Algorithms. Coin Change. Coin Change. Coin Change. Coin Change. Greedy algorithms. Greedy Algorithms

Preview 11/1/2017. Greedy Algorithms. Coin Change. Coin Change. Coin Change. Coin Change. Greedy algorithms. Greedy Algorithms Preview Greed Algorithms Greed Algorithms Coin Chnge Huffmn Code Greed lgorithms end to e simple nd strightforwrd. Are often used to solve optimiztion prolems. Alws mke the choice tht looks est t the moment,

More information

Chapter 2 Finite Automata

Chapter 2 Finite Automata Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht

More information

Formal Languages and Automata

Formal Languages and Automata Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University

More information

Convert the NFA into DFA

Convert the NFA into DFA Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:

More information

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3 2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Homework 3 Solutions

Homework 3 Solutions CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata CS103B ndout 18 Winter 2007 Ferury 28, 2007 Finite Automt Initil text y Mggie Johnson. Introduction Severl childrens gmes fit the following description: Pieces re set up on plying ord; dice re thrown or

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

The maximal number of runs in standard Sturmian words

The maximal number of runs in standard Sturmian words The miml numer of runs in stndrd Sturmin words Pwe l Bturo Mrcin Piątkowski Wojciech Rytter Fculty of Mthemtics nd Computer Science Nicolus Copernicus University Institute of Informtics Wrsw University

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

1 Nondeterministic Finite Automata

1 Nondeterministic Finite Automata 1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you

More information

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.) CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts

More information

Lecture 09: Myhill-Nerode Theorem

Lecture 09: Myhill-Nerode Theorem CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives

More information

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2016 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 9 1. (4pts) ((p q) (q r)) (p r), prove tutology using truth tles. p

More information

Lecture 08: Feb. 08, 2019

Lecture 08: Feb. 08, 2019 4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny

More information

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University U.U.D.M. Project Report 07:4 Frey Frctions Rickrd Fernström Exmensrete i mtemtik, 5 hp Hledre: Andres Strömergsson Exmintor: Jörgen Östensson Juni 07 Deprtment of Mthemtics Uppsl University Frey Frctions

More information

Regular expressions, Finite Automata, transition graphs are all the same!!

Regular expressions, Finite Automata, transition graphs are all the same!! CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1

More information

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz University of Southern Cliforni Computer Science Deprtment Compiler Design Fll Lexicl Anlysis Smple Exercises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sciences Institute 4676 Admirlty Wy, Suite

More information

Module 9: Tries and String Matching

Module 9: Tries and String Matching Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer

More information

Module 9: Tries and String Matching

Module 9: Tries and String Matching Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer

More information

Algorithms in Computational. Biology. More on BWT

Algorithms in Computational. Biology. More on BWT Algorithms in Computtionl Biology More on BWT tody Plese Lst clss! don't forget to submit And by next (vi emil, repo ) implementtion week or shre prgectfltw get Not I would like reding overview! Discuss

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

The area under the graph of f and above the x-axis between a and b is denoted by. f(x) dx. π O

The area under the graph of f and above the x-axis between a and b is denoted by. f(x) dx. π O 1 Section 5. The Definite Integrl Suppose tht function f is continuous nd positive over n intervl [, ]. y = f(x) x The re under the grph of f nd ove the x-xis etween nd is denoted y f(x) dx nd clled the

More information

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute Victor Admchik Dnny Sletor Gret Theoreticl Ides In Computer Science CS 5-25 Spring 2 Lecture 2 Mr 3, 2 Crnegie Mellon University Deterministic Finite Automt Finite Automt A mchine so simple tht you cn

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont. NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD

More information

Review of Gaussian Quadrature method

Review of Gaussian Quadrature method Review of Gussin Qudrture method Nsser M. Asi Spring 006 compiled on Sundy Decemer 1, 017 t 09:1 PM 1 The prolem To find numericl vlue for the integrl of rel vlued function of rel vrile over specific rnge

More information

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018 CS 301 Lecture 04 Regulr Expressions Stephen Checkowy Jnury 29, 2018 1 / 35 Review from lst time NFA N = (Q, Σ, δ, q 0, F ) where δ Q Σ P (Q) mps stte nd n lphet symol (or ) to set of sttes We run n NFA

More information

Fingerprint idea. Assume:

Fingerprint idea. Assume: Fingerprint ide Assume: We cn compute fingerprint f(p) of P in O(m) time. If f(p) f(t[s.. s+m 1]), then P T[s.. s+m 1] We cn compre fingerprints in O(1) We cn compute f = f(t[s+1.. s+m]) from f(t[s.. s+m

More information

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton 25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q

More information

Foundations of XML Types: Tree Automata

Foundations of XML Types: Tree Automata 1 / 43 Foundtions of XML Types: Tree Automt Pierre Genevès CNRS (slides mostly sed on slides y W. Mrtens nd T. Schwentick) University of Grenole Alpes, 2017 2018 2 / 43 Why Tree Automt? Foundtions of XML

More information

New Expansion and Infinite Series

New Expansion and Infinite Series Interntionl Mthemticl Forum, Vol. 9, 204, no. 22, 06-073 HIKARI Ltd, www.m-hikri.com http://dx.doi.org/0.2988/imf.204.4502 New Expnsion nd Infinite Series Diyun Zhng College of Computer Nnjing University

More information

Regular Languages and Applications

Regular Languages and Applications Regulr Lnguges nd Applictions Yo-Su Hn Deprtment of Computer Science Yonsei University 1-1 SNU 4/14 Regulr Lnguges An old nd well-known topic in CS Kleene Theorem in 1959 FA (finite-stte utomton) constructions:

More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

3 Regular expressions

3 Regular expressions 3 Regulr expressions Given n lphet Σ lnguge is set of words L Σ. So fr we were le to descrie lnguges either y using set theory (i.e. enumertion or comprehension) or y n utomton. In this section we shll

More information

CSE : Exam 3-ANSWERS, Spring 2011 Time: 50 minutes

CSE : Exam 3-ANSWERS, Spring 2011 Time: 50 minutes CSE 260-002: Exm 3-ANSWERS, Spring 20 ime: 50 minutes Nme: his exm hs 4 pges nd 0 prolems totling 00 points. his exm is closed ook nd closed notes.. Wrshll s lgorithm for trnsitive closure computtion is

More information

Random subgroups of a free group

Random subgroups of a free group Rndom sugroups of free group Frédérique Bssino LIPN - Lortoire d Informtique de Pris Nord, Université Pris 13 - CNRS Joint work with Armndo Mrtino, Cyril Nicud, Enric Ventur et Pscl Weil LIX My, 2015 Introduction

More information

Homework Solution - Set 5 Due: Friday 10/03/08

Homework Solution - Set 5 Due: Friday 10/03/08 CE 96 Introduction to the Theory of Computtion ll 2008 Homework olution - et 5 Due: ridy 10/0/08 1. Textook, Pge 86, Exercise 1.21. () 1 2 Add new strt stte nd finl stte. Mke originl finl stte non-finl.

More information

2. Udi Manber, Gene Myers: Suffix arrays: A new method for online string searching, SIAM Journal on Computing 22:935-48,1993

2. Udi Manber, Gene Myers: Suffix arrays: A new method for online string searching, SIAM Journal on Computing 22:935-48,1993 7 Suffix rrys This exposition ws developed y Clemens Gröpl nd Knut Reinert. It is sed on the following sources, which re ll recommended reding: 1. Simon J. Puglisi, W. F. Smyth, nd Andrew Turpin, A txonomy

More information

Theoretical foundations of Gaussian quadrature

Theoretical foundations of Gaussian quadrature Theoreticl foundtions of Gussin qudrture 1 Inner product vector spce Definition 1. A vector spce (or liner spce) is set V = {u, v, w,...} in which the following two opertions re defined: (A) Addition of

More information

Finite-State Automata: Recap

Finite-State Automata: Recap Finite-Stte Automt: Recp Deepk D Souz Deprtment of Computer Science nd Automtion Indin Institute of Science, Bnglore. 09 August 2016 Outline 1 Introduction 2 Forml Definitions nd Nottion 3 Closure under

More information

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary Outline Genetic Progrmming Evolutionry strtegies Genetic progrmming Summry Bsed on the mteril provided y Professor Michel Negnevitsky Evolutionry Strtegies An pproch simulting nturl evolution ws proposed

More information

1 APL13: Suffix Arrays: more space reduction

1 APL13: Suffix Arrays: more space reduction 1 APL13: Suffix Arrys: more spce reduction In Section??, we sw tht when lphbet size is included in the time nd spce bounds, the suffix tree for string of length m either requires Θ(m Σ ) spce or the minimum

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

Some Theory of Computation Exercises Week 1

Some Theory of Computation Exercises Week 1 Some Theory of Computtion Exercises Week 1 Section 1 Deterministic Finite Automt Question 1.3 d d d d u q 1 q 2 q 3 q 4 q 5 d u u u u Question 1.4 Prt c - {w w hs even s nd one or two s} First we sk whether

More information

CS 330 Formal Methods and Models

CS 330 Formal Methods and Models CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2017 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 2 1. Prove ((( p q) q) p) is tutology () (3pts) y truth tle. p q p q

More information

New data structures to reduce data size and search time

New data structures to reduce data size and search time New dt structures to reduce dt size nd serch time Tsuneo Kuwbr Deprtment of Informtion Sciences, Fculty of Science, Kngw University, Hirtsuk-shi, Jpn FIT2018 1D-1, No2, pp1-4 Copyright (c)2018 by The Institute

More information

Lecture Solution of a System of Linear Equation

Lecture Solution of a System of Linear Equation ChE Lecture Notes, Dept. of Chemicl Engineering, Univ. of TN, Knoville - D. Keffer, 5/9/98 (updted /) Lecture 8- - Solution of System of Liner Eqution 8. Why is it importnt to e le to solve system of liner

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Linear Inequalities. Work Sheet 1

Linear Inequalities. Work Sheet 1 Work Sheet 1 Liner Inequlities Rent--Hep, cr rentl compny,chrges $ 15 per week plus $ 0.0 per mile to rent one of their crs. Suppose you re limited y how much money you cn spend for the week : You cn spend

More information

Solving the String Statistics Problem in Time O(n log n)

Solving the String Statistics Problem in Time O(n log n) Alcom-FT Technicl Report Series ALCOMFT-TR-02-55 Solving the String Sttistics Prolem in Time O(n log n) Gerth Stølting Brodl 1,,, Rune B. Lyngsø 3, Ann Östlin1,, nd Christin N. S. Pedersen 1,2, 1 BRICS,

More information

19 Optimal behavior: Game theory

19 Optimal behavior: Game theory Intro. to Artificil Intelligence: Dle Schuurmns, Relu Ptrscu 1 19 Optiml behvior: Gme theory Adversril stte dynmics hve to ccount for worst cse Compute policy π : S A tht mximizes minimum rewrd Let S (,

More information

Model Reduction of Finite State Machines by Contraction

Model Reduction of Finite State Machines by Contraction Model Reduction of Finite Stte Mchines y Contrction Alessndro Giu Dip. di Ingegneri Elettric ed Elettronic, Università di Cgliri, Pizz d Armi, 09123 Cgliri, Itly Phone: +39-070-675-5892 Fx: +39-070-675-5900

More information

Formal Language and Automata Theory (CS21004)

Formal Language and Automata Theory (CS21004) Forml Lnguge nd Automt Forml Lnguge nd Automt Theory (CS21004) Khrgpur Khrgpur Khrgpur Forml Lnguge nd Automt Tle of Contents Forml Lnguge nd Automt Khrgpur 1 2 3 Khrgpur Forml Lnguge nd Automt Forml Lnguge

More information

Myhill-Nerode Theorem

Myhill-Nerode Theorem Overview Myhill-Nerode Theorem Correspondence etween DA s nd MN reltions Cnonicl DA for L Computing cnonicl DFA Myhill-Nerode Theorem Deepk D Souz Deprtment of Computer Science nd Automtion Indin Institute

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

The Regulated and Riemann Integrals

The Regulated and Riemann Integrals Chpter 1 The Regulted nd Riemnn Integrls 1.1 Introduction We will consider severl different pproches to defining the definite integrl f(x) dx of function f(x). These definitions will ll ssign the sme vlue

More information

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar)

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar) Lecture 3 (5.3.2018) (trnslted nd slightly dpted from lecture notes by Mrtin Klzr) Riemnn integrl Now we define precisely the concept of the re, in prticulr, the re of figure U(, b, f) under the grph of

More information

(e) if x = y + z and a divides any two of the integers x, y, or z, then a divides the remaining integer

(e) if x = y + z and a divides any two of the integers x, y, or z, then a divides the remaining integer Divisibility In this note we introduce the notion of divisibility for two integers nd b then we discuss the division lgorithm. First we give forml definition nd note some properties of the division opertion.

More information

7.2 The Definite Integral

7.2 The Definite Integral 7.2 The Definite Integrl the definite integrl In the previous section, it ws found tht if function f is continuous nd nonnegtive, then the re under the grph of f on [, b] is given by F (b) F (), where

More information

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers 80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES 2.6 Finite Stte Automt With Output: Trnsducers So fr, we hve only considered utomt tht recognize lnguges, i.e., utomt tht do not produce ny output on ny input

More information

DFA Minimization and Applications

DFA Minimization and Applications DFA Minimiztion nd Applictions Mondy, Octoer 15, 2007 Reding: toughton 3.12 C235 Lnguges nd Automt Deprtment of Computer cience Wellesley College Gols for ody o Answer ny P3 questions you might hve. o

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016 CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010 CS 311 Homework 3 due 16:30, Thursdy, 14 th Octoer 2010 Homework must e sumitted on pper, in clss. Question 1. [15 pts.; 5 pts. ech] Drw stte digrms for NFAs recognizing the following lnguges:. L = {w

More information

Closure Properties of Regular Languages

Closure Properties of Regular Languages Closure Properties of Regulr Lnguges Regulr lnguges re closed under mny set opertions. Let L 1 nd L 2 e regulr lnguges. (1) L 1 L 2 (the union) is regulr. (2) L 1 L 2 (the conctention) is regulr. (3) L

More information

Section 4: Integration ECO4112F 2011

Section 4: Integration ECO4112F 2011 Reding: Ching Chpter Section : Integrtion ECOF Note: These notes do not fully cover the mteril in Ching, ut re ment to supplement your reding in Ching. Thus fr the optimistion you hve covered hs een sttic

More information

Faster Regular Expression Matching. Philip Bille Mikkel Thorup

Faster Regular Expression Matching. Philip Bille Mikkel Thorup Fster Regulr Expression Mtching Philip Bille Mikkel Thorup Outline Definition Applictions History tour of regulr expression mtching Thompson s lgorithm Myers lgorithm New lgorithm Results nd extensions

More information

Applicable Analysis and Discrete Mathematics available online at

Applicable Analysis and Discrete Mathematics available online at Applicble Anlysis nd Discrete Mthemtics vilble online t http://pefmth.etf.rs Appl. Anl. Discrete Mth. 4 (2010), 23 31. doi:10.2298/aadm100201012k NUMERICAL ANALYSIS MEETS NUMBER THEORY: USING ROOTFINDING

More information

Generalized Fano and non-fano networks

Generalized Fano and non-fano networks Generlized Fno nd non-fno networks Nildri Ds nd Brijesh Kumr Ri Deprtment of Electronics nd Electricl Engineering Indin Institute of Technology Guwhti, Guwhti, Assm, Indi Emil: {d.nildri, bkri}@iitg.ernet.in

More information

Where did dynamic programming come from?

Where did dynamic programming come from? Where did dynmic progrmming come from? String lgorithms Dvid Kuchk cs302 Spring 2012 Richrd ellmn On the irth of Dynmic Progrmming Sturt Dreyfus http://www.eng.tu.c.il/~mi/cd/ or50/1526-5463-2002-50-01-0048.pdf

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

Lexical Analysis Finite Automate

Lexical Analysis Finite Automate Lexicl Anlysis Finite Automte CMPSC 470 Lecture 04 Topics: Deterministic Finite Automt (DFA) Nondeterministic Finite Automt (NFA) Regulr Expression NFA DFA A. Finite Automt (FA) FA re grph, like trnsition

More information

DFA minimisation using the Myhill-Nerode theorem

DFA minimisation using the Myhill-Nerode theorem DFA minimistion using the Myhill-Nerode theorem Johnn Högerg Lrs Lrsson Astrct The Myhill-Nerode theorem is n importnt chrcteristion of regulr lnguges, nd it lso hs mny prcticl implictions. In this chpter,

More information

Section 6.1 INTRO to LAPLACE TRANSFORMS

Section 6.1 INTRO to LAPLACE TRANSFORMS Section 6. INTRO to LAPLACE TRANSFORMS Key terms: Improper Integrl; diverge, converge A A f(t)dt lim f(t)dt Piecewise Continuous Function; jump discontinuity Function of Exponentil Order Lplce Trnsform

More information

NFAs continued, Closure Properties of Regular Languages

NFAs continued, Closure Properties of Regular Languages Algorithms & Models of Computtion CS/ECE 374, Fll 2017 NFAs continued, Closure Properties of Regulr Lnguges Lecture 5 Tuesdy, Septemer 12, 2017 Sriel Hr-Peled (UIUC) CS374 1 Fll 2017 1 / 31 Regulr Lnguges,

More information

Exercises Chapter 1. Exercise 1.1. Let Σ be an alphabet. Prove wv = w + v for all strings w and v.

Exercises Chapter 1. Exercise 1.1. Let Σ be an alphabet. Prove wv = w + v for all strings w and v. 1 Exercises Chpter 1 Exercise 1.1. Let Σ e n lphet. Prove wv = w + v for ll strings w nd v. Prove # (wv) = # (w)+# (v) for every symol Σ nd every string w,v Σ. Exercise 1.2. Let w 1,w 2,...,w k e k strings,

More information

Converting Regular Expressions to Discrete Finite Automata: A Tutorial

Converting Regular Expressions to Discrete Finite Automata: A Tutorial Converting Regulr Expressions to Discrete Finite Automt: A Tutoril Dvid Christinsen 2013-01-03 This is tutoril on how to convert regulr expressions to nondeterministic finite utomt (NFA) nd how to convert

More information

List all of the possible rational roots of each equation. Then find all solutions (both real and imaginary) of the equation. 1.

List all of the possible rational roots of each equation. Then find all solutions (both real and imaginary) of the equation. 1. Mth Anlysis CP WS 4.X- Section 4.-4.4 Review Complete ech question without the use of grphing clcultor.. Compre the mening of the words: roots, zeros nd fctors.. Determine whether - is root of 0. Show

More information

Math 4310 Solutions to homework 1 Due 9/1/16

Math 4310 Solutions to homework 1 Due 9/1/16 Mth 4310 Solutions to homework 1 Due 9/1/16 1. Use the Eucliden lgorithm to find the following gretest common divisors. () gcd(252, 180) = 36 (b) gcd(513, 187) = 1 (c) gcd(7684, 4148) = 68 252 = 180 1

More information