Title: Studies on Efficient Index Construction for Multiple and Repetitive Texts
Author(s): 髙木, 拓也
DOI: /doctoral.k13077
Type: theses (doctoral)
File Information: Takuya_Takagi.pdf
Hokkaido University Collection of Scholarly and Ac
Studies on Efficient Index Construction for Multiple and Repetitive Texts
Takuya Takagi
January 2018
Division of Computer Science and Information Technology
Graduate School of Information Science and Technology
Hokkaido University
Abstract

The text indexing problem is one of the fundamental problems in computer science; the aim is to construct an efficient data structure that answers queries such as text pattern matching. Over the last decades, there has been an increasing amount of multiple texts, such as data generated from multiple sensors, and of repetitive texts, such as genome sequence collections. For example, the GeoLife Project collects trajectories from GPS loggers that have a variety of sampling rates. These trajectories were recorded every 1 to 5 seconds or every 5 to 10 meters per point. As another example, the 1000 Genomes Project collects human genomes from various groups. Since the genomes are similar to each other, the same substructures appear repeatedly in this genome database. These projects aim at data analysis, information retrieval, and data mining on text information.

Pattern matching, the most fundamental query on texts, can be answered with basic text pattern matching algorithms such as the Knuth-Morris-Pratt (KMP) algorithm and the Boyer-Moore (BM) algorithm. Since these algorithms scan the texts for each query, a single query takes time linear in the database size in the worst case. To process such data quickly, preprocessing and indexing are important. For example, the suffix tree, one of the basic text indexes, supports pattern matching in time linear in the pattern length. Therefore, building an efficient index structure is the key to processing these large amounts of text information.

In this thesis, we show efficient index construction algorithms for text data. For multiple texts and repetitive texts, there are several problems with indexing.
Since multiple sensor data such as GPS trajectories grow constantly, the index must support online construction for multiple texts. For repetitive texts, that is, collections of similar texts such as genome sequences, we should be able to build an index of more compressed size. To solve these problems, we propose several new index structures and construction algorithms. In particular, this thesis deals with speeding up the construction and operations of indexes, online construction of indexes for multiple texts, and construction of compressed indexes for texts including long repetitions.

In Chapter 3, we propose a faster version of labeled trees (compact tries), called packed compact tries, based on a bit-parallel method. With it, we show faster construction of text indexes such as suffix trees and faster operations such as prefix search, insertion, and deletion. Since the compact trie is a widely used data structure, packed compact tries can speed up a number of algorithms. In particular, we show that the LZ-double factorization, a text compression algorithm, is sped up.

In Chapter 4, we first define the fully-online construction problem, a setting in which a new input symbol can be appended to an arbitrary string in the set of input strings. To solve this problem, we first show a fully-online construction algorithm for a DAG index called the directed acyclic word graph (DAWG). We also propose a fully-online construction algorithm for the suffix tree using the similarity between DAWGs and suffix trees.

In Chapter 5, we propose a self-indexing method that combines an index called the compact directed acyclic word graph (CDAWG) with grammar compression, which is one of the compression methods. When the input text is compressible, the index can be held in a size smaller than the original text.

In Chapter 6, we give conclusions and future work. Overall, this thesis studies efficient algorithms for text index construction.
Contents

1 Introduction
    Background
    Research goals
    Summary of the results
    Contributions of this thesis
2 Preliminaries
    Notations on Strings
    Notations on graphical indexes
    Suffix tries
    Suffix trees
    Directed acyclic word graphs (DAWGs)
    Duality of suffix trees and DAWGs
    Compact directed acyclic word graphs (CDAWGs)
3 Packed Compact Tries
    Background
    Related work
    Preliminaries
    Compact tries
    Dynamic predecessor data structures
    Packed dynamic compact tries
    Micro dynamic compact tries for short strings
    Packed dynamic compact tries for long strings
    Micro trie decomposition
    Speeding-up with hashing
    Applications to online string processing
    Preliminary experiments
    Conclusions of Chapter 3
4 Fully-online Construction of Suffix trees for Multiple Texts
    Background
    Related work
    Preliminaries
    Suffix trees and DAWGs for multiple texts
    Fully-online text collection
    Fully-online version of DAWG and Weiner's suffix tree algorithm
    Semi-online construction of Weiner's suffix trees and DAWGs
    Fully-online construction of Weiner's suffix trees and DAWGs
    Fully-online version of Ukkonen's suffix tree algorithm
    Semi-online left-to-right suffix tree construction
    Difficulties in fully-online left-to-right suffix tree construction
    Fully-online left-to-right suffix tree algorithms
    Conclusions of Chapter 4
5 Linear-size Compact Directed Acyclic Word Graphs
    Background
    Preliminaries
    LSTrie
    Straight-line programs
    The proposed data structure: L-CDAWG
    Outline
    Constructing type-2 nodes and edge suffix links
    Construction of the SLP for L-CDAWG
    The main result
    Conclusions of Chapter 5
6 Conclusions and Future Work
    Summary of the results
    Future work
Chapter 1 Introduction

1.1 Background

Over the last decades, there has been an increasing amount of unstructured data such as genetic data, logging data, and Web and SNS texts, which has been termed big data. Most of these unstructured data are available in the form of text. Therefore, there is demand for algorithms and data structures that can efficiently handle such big unstructured data.

Multiple growing texts and repetitive texts are characteristic features of text data such as logging data and genetic data. Multiple growing texts are a set of texts in which a new symbol can be appended at the end of any text in the set. Many real-world text data have this feature. For example, due to the rapid development of network and sensor technologies, various and enormous stream data are generated from multiple sources, such as GPS trajectory data [72], sensor streams, and Twitter streams. These are represented as multiple texts or multiple sequences that are constantly growing.

Another feature of textual big data is called repetitiveness [56]. It refers to text sets consisting of similar texts. For example, genome sequences [21] and versioned document collections such as software repositories are highly repetitive texts.
These data contain many long repetitions in the text.

1.2 Research goals

In order to use big data, it is necessary to perform various queries for tasks such as data mining and information retrieval. However, because of the massive amount of data, even simple queries such as text pattern matching take too much time. One solution is to preprocess the data and create an index that supports the query so that it can be answered quickly. Among text indexes, those that hold the information of all substrings of each text support the most diverse queries.

In this thesis, we study efficient index construction for multiple texts and repetitive texts. Index construction for such text data faces the following demands. First, in order to process a large amount of data at high speed, we want an index that supports fast queries. Second, in order to construct an index for multiple growing texts, we need an index that enables online construction for multiple texts. Finally, to store a large amount of data, the index must be small in size.

1.3 Summary of the results

This thesis presents three main results on text indexing. In Chapter 2, we introduce notations and definitions of some data structures.

In Chapter 3, we study the acceleration of compact tries using the packed string technique. The dynamic compact trie [42, 65] is a fundamental data structure for storing a set of variable-length strings. It can store a set of $k$ strings over an alphabet $\Sigma$ with total size $n$ in $O(n \log n)$ bits of space. We propose packed compact tries that support faster prefix search queries and update operations than compact tries on the standard word RAM model, while still using $n \log \sigma + O(k \log n)$ bits of space.
In Chapter 4, we study fully-online construction of DAWGs and suffix trees for multiple texts. Let $\mathcal{T} = \{T_1, \ldots, T_K\}$ be a collection of texts. By fully-online, we mean that a new character can be appended to any text in $\mathcal{T}$ at any time. This is a natural generalization of semi-online construction of indexing data structures for multiple texts, in which, after a new character is appended to the $k$-th text $T_k$, the previous texts $T_1, \ldots, T_{k-1}$ remain static. We propose fully-online algorithms which construct the directed acyclic word graph (DAWG) [14] and the generalized suffix tree (GST) [42] for $\mathcal{T}$ in $O(n \log \sigma)$ time and $O(n)$ words of space, where $n$ and $\sigma$ denote the total length of the texts in $\mathcal{T}$ and the alphabet size, respectively.

In Chapter 5, we study a compressed index combining CDAWGs and grammar compression. Recent studies have shown that the topology of the compact directed acyclic word graph (CDAWG) [15] achieves compressed size for repetitive strings. However, no method is known that supports high-speed search within the compressed size without holding the original input string. The linear-size CDAWG proposed in this thesis achieves the compressed size while supporting search time similar to that of the original CDAWG.

In Chapter 6, we summarize this thesis and discuss possible future research.

1.4 Contributions of this thesis

We studied three fundamental problems that arise when constructing an index that can efficiently handle a massive amount of text data. A versatile text index has three features: high-speed queries, fully-online construction, and small space complexity. Each result of this thesis shows an index which achieves one of the three features. First, as a basis of efficient text indexes allowing high-speed query processing, we proposed an improved data structure supporting high-speed construction and queries by using bit-parallel methods. Secondly, for multiple growing texts like stream data
from multiple sensors, we proposed a construction algorithm for an index in a fully-online manner. Thirdly, for texts that contain many repetitive structures, we proposed an index that can capture the repeating structure and store it in compressed size. Overall, we studied efficient algorithms for text index construction, which form a basis for achieving an index with all three features.
Chapter 2 Preliminaries

In this chapter, we introduce basic definitions and notations on strings, suffix tries, suffix trees, directed acyclic word graphs, and compact directed acyclic word graphs according to [24-26, 42].

2.1 Notations on Strings

Let $\Sigma$ be an ordered alphabet. Any element of $\Sigma^*$ is called a string. For any string $T$, let $|T|$ denote its length. Let $\varepsilon$ be the empty string, namely, $|\varepsilon| = 0$. If $T = XYZ$, then $X$, $Y$, and $Z$ are called a prefix, a substring, and a suffix of $T$, respectively. For any $1 \le i \le j \le |T|$, let $T[i..j]$ denote the substring of $T$ that begins at position $i$ and ends at position $j$ in $T$. For any $1 \le i \le |T|$, let $T[i]$ denote the $i$th character of $T$. For any string $T$, let $\mathit{Suffix}(T)$ denote the set of suffixes of $T$, and for any set $\mathcal{T}$ of strings, let $\mathit{Suffix}(\mathcal{T})$ denote the set of suffixes of all strings in $\mathcal{T}$. Namely, $\mathit{Suffix}(\mathcal{T}) = \bigcup_{T \in \mathcal{T}} \mathit{Suffix}(T)$. For any string $T$, let $\overline{T}$ denote the reversed string of $T$, i.e., $\overline{T} = T[|T|] \cdots T[1]$.

Let $\mathcal{T} = \{T_1, \ldots, T_K\}$ be a collection of $K$ texts. For any $1 \le k \le K$, let $\mathit{lrs}_{\mathcal{T}}(T_k)$ be the longest repeating suffix of $T_k$ that occurs at least twice in $\mathcal{T}$. For any strings
$X, Y$, $\mathit{LCP}(X, Y)$ denotes the longest common prefix of $X$ and $Y$. Throughout this thesis, the base of logarithms is 2 unless otherwise stated. For any integers $i \le j$, $[i, j]$ denotes the interval $\{i, i+1, \ldots, j\}$.

Our model of computation is the standard word RAM of word size $w = \log n$ bits. For simplicity, we assume that $w$ is a multiple of $\log \sigma$, so $\alpha = \log_\sigma n$ letters are packed in a single word. Since we can read $w$ bits in constant time, we can read and process $\alpha$ consecutive letters in constant time.

2.2 Notations on graphical indexes

All index structures dealt with in this thesis, such as suffix tries, suffix trees, CDAWGs, linear-size suffix tries (LSTries), and linear-size CDAWGs (L-CDAWGs), are graphical indexes in the sense that an index is a pointer-based structure built on an underlying DAG $G_L = (V(L), E(L))$ with a root $r \in V(L)$ and a mapping $\mathit{lab} : E(L) \to \Sigma^+$ that assigns a label $\mathit{lab}(e)$ to each edge $e \in E(L)$. For an edge $e = (u, v) \in E(L)$, we denote its end points by $e.\mathit{hi} := u$ and $e.\mathit{lo} := v$, respectively. The label string of $e$ is $\mathit{lab}(e) \in \Sigma^+$. The string length of $e$ is $\mathit{slen}(e) := |\mathit{lab}(e)| \ge 1$. An edge is called atomic if $\mathit{slen}(e) = 1$, and thus $\mathit{lab}(e) \in \Sigma$. For a path $p = (e_1, \ldots, e_k)$ of length $k \ge 1$, we extend its end points, label string, and string length by $p.\mathit{hi} := e_1.\mathit{hi}$, $p.\mathit{lo} := e_k.\mathit{lo}$, $\mathit{lab}(p) := \mathit{lab}(e_1) \cdots \mathit{lab}(e_k) \in \Sigma^+$, and $\mathit{slen}(p) := \mathit{slen}(e_1) + \cdots + \mathit{slen}(e_k) \ge 1$, respectively.

2.3 Suffix tries

The suffix trie for a text collection $\mathcal{T} = \{T_1, \ldots, T_K\}$, denoted $\mathit{STrie}(\mathcal{T})$, is a trie which represents $\mathit{Suffix}(\mathcal{T})$. The size of $\mathit{STrie}(\mathcal{T})$ is $O(n^2)$, where $n$ is the total length of the texts in $\mathcal{T}$. We identify each node $v$ of $\mathit{STrie}(\mathcal{T})$ with the string that $v$ represents.
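The suffix trie just defined can be illustrated with a minimal Python sketch. It is a naive construction for illustration only, not one of the algorithms of this thesis; the names `build_suffix_trie`, `count_nodes`, and `contains` are hypothetical, and nested dictionaries stand in for trie nodes. Inserting every suffix character by character makes the quadratic size bound easy to observe on small inputs.

```python
def build_suffix_trie(texts):
    """Return the suffix trie of a text collection as nested dicts.

    Each dict is a node; each key is the single-character label of an
    out-going edge. Naive construction: every suffix is inserted in full.
    """
    root = {}
    for text in texts:
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node.setdefault(ch, {})
    return root


def count_nodes(node):
    """Count the nodes of the trie rooted at `node`, including `node`."""
    return 1 + sum(count_nodes(child) for child in node.values())


def contains(root, pattern):
    """Return True iff `pattern` is a substring of some indexed text."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie(["abab"])
print(count_nodes(trie))      # root plus one node per distinct substring
print(contains(trie, "ba"))
```

Because each node is identified with the string it represents, the node count equals the number of distinct substrings plus one for the root.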
[Figure 2.1. Panels: Suffix Trie, Suffix Tree, DAWG, CDAWG; arrows between panels labeled "Path compaction" and "Minimization".]

Figure 2.1: Illustration of $\mathit{STrie}(\mathcal{T})$, $\mathit{STree}(\mathcal{T})$, $\mathit{DAWG}(\mathcal{T})$, and $\mathit{CDAWG}(\mathcal{T})$. The solid arrows and broken arrows represent the edges and the suffix links of each data structure, respectively.

A substring $x$ of a text in $\mathcal{T}$ is said to be branching in $\mathcal{T}$ if there exist two distinct characters $a, b \in \Sigma$ such that both $xa$ and $xb$ are substrings of some texts in $\mathcal{T}$. Clearly, a node $x$ of $\mathit{STrie}(\mathcal{T})$ is branching iff $x$ is branching in $\mathcal{T}$. For each node $av$ of $\mathit{STrie}(\mathcal{T})$ with $a \in \Sigma$ and $v \in \Sigma^*$, let $\mathit{slink}(av) = v$. This auxiliary edge $\mathit{slink}(av) = v$ from $av$ to $v$ is called a suffix link. We define the reversed suffix link $W_a(v) = av$ iff $\mathit{slink}(av) = v$. For any node $v$ and $a \in \Sigma$, if $av$ is not a substring of the texts in $\mathcal{T}$, then $W_a(v)$ is undefined. By definition, the reversed suffix links on $\mathit{STrie}(\mathcal{T})$ form a rooted tree which coincides with $\mathit{STrie}(\overline{\mathcal{T}})$, the suffix trie for the collection $\overline{\mathcal{T}} = \{\overline{T_1}, \ldots, \overline{T_K}\}$ of the reversed texts.
2.4 Suffix trees

The suffix tree [68] for a text collection $\mathcal{T}$, denoted $\mathit{STree}(\mathcal{T})$, is a compacted trie which represents $\mathit{Suffix}(\mathcal{T})$. $\mathit{STree}(\mathcal{T})$ is obtained by compacting every path of $\mathit{STrie}(\mathcal{T})$ which consists of non-branching internal nodes (see Fig. 2.1). Since every internal node of $\mathit{STree}(\mathcal{T})$ is branching, and since there are at most $n$ leaves in $\mathit{STree}(\mathcal{T})$, the numbers of edges and nodes are $O(n)$. The edge labels of $\mathit{STree}(\mathcal{T})$ are non-empty substrings of some text in $\mathcal{T}$. By representing each edge label $x$ with a triple $\langle k, i, j \rangle$ of integers such that $x = T_k[i..j]$, $\mathit{STree}(\mathcal{T})$ can be stored in $O(n)$ space.

We say that any branching (resp. non-branching) substring of $\mathcal{T}$ is an explicit node (resp. implicit node) of $\mathit{STree}(\mathcal{T})$. An implicit node $x$ is represented by a triple $(v, a, \ell)$, called a reference to $x$, such that $v$ is an explicit ancestor of $x$, $a$ is the first character of the path from $v$ to $x$, and $\ell$ is the length of the path from $v$ to $x$. A reference $(v, a, \ell)$ to node $x$ is called canonical if $v$ is the lowest explicit ancestor of $x$.

For each explicit node $av$ of $\mathit{STree}(\mathcal{T})$ with $a \in \Sigma$ and $v \in \Sigma^*$, let $\mathit{slink}(av) = v$. For each explicit node $v$ and $a \in \Sigma$, we also define the reversed suffix link $W_a(v) = avx$, where $x \in \Sigma^*$ is the shortest string such that $avx$ is an explicit node of $\mathit{STree}(\mathcal{T})$. $W_a(v)$ is undefined if $av$ is not a substring of the texts in $\mathcal{T}$. These reversed suffix links are also called Weiner links (or W-links for short) in the literature [16]. A W-link $W_a(v) = avx$ is said to be hard if $x = \varepsilon$, and soft if $x \in \Sigma^+$. Let $w_a$ be a Boolean function such that for any explicit node $v$ and $a \in \Sigma$, $w_a(v) = 1$ iff a (soft or hard) W-link $W_a(v)$ exists. Notice that if $w_a(v) = 1$ for a node $v$ and $a \in \Sigma$, then $w_a(u) = 1$ for every ancestor $u$ of $v$.

2.5 Directed acyclic word graphs (DAWGs)

The directed acyclic word graph (DAWG for short) [14, 15] of a text collection $\mathcal{T}$, denoted $\mathit{DAWG}(\mathcal{T})$, is the smallest DAG which represents $\mathit{Suffix}(\mathcal{T})$. $\mathit{DAWG}(\mathcal{T})$ is obtained by
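merging identical subtrees of $\mathit{STrie}(\mathcal{T})$ connected by the suffix links (see Fig. 2.1). Hence, the label of every edge of $\mathit{DAWG}(\mathcal{T})$ is a single character. The numbers of nodes and edges of $\mathit{DAWG}(\mathcal{T})$ are $O(n)$ [15], and hence $\mathit{DAWG}(\mathcal{T})$ can be stored in $O(n)$ space.

$\mathit{DAWG}(\mathcal{T})$ can be defined formally as follows. For any string $x$, let $\mathit{Epos}_{\mathcal{T}}(x)$ be the set of ending positions of $x$ in the texts in $\mathcal{T}$, i.e., $\mathit{Epos}_{\mathcal{T}}(x) = \{(k, j) \mid x = T_k[j - |x| + 1..j],\ 1 \le j \le |T_k|,\ 1 \le k \le K\}$. Consider an equivalence relation $\equiv_{\mathcal{T}}$ on substrings $x, y$ of texts in $\mathcal{T}$ such that $x \equiv_{\mathcal{T}} y$ iff $\mathit{Epos}_{\mathcal{T}}(x) = \mathit{Epos}_{\mathcal{T}}(y)$. For any substring $x$ of texts of $\mathcal{T}$, let $[x]_{\mathcal{T}}$ denote the equivalence class w.r.t. $\equiv_{\mathcal{T}}$. There is a one-to-one correspondence between each node $v$ of $\mathit{DAWG}(\mathcal{T})$ and each equivalence class $[x]_{\mathcal{T}}$, and hence we will identify each node $v$ of $\mathit{DAWG}(\mathcal{T})$ with its corresponding equivalence class $[x]_{\mathcal{T}}$. Let $\mathit{long}([x]_{\mathcal{T}})$ denote the longest member of $[x]_{\mathcal{T}}$. By the definition of equivalence classes, $\mathit{long}([x]_{\mathcal{T}})$ is unique for each $[x]_{\mathcal{T}}$ and every member of $[x]_{\mathcal{T}}$ is a suffix of $\mathit{long}([x]_{\mathcal{T}})$. If $x, xa$ are substrings of some text in $\mathcal{T}$ with $x \in \Sigma^*$ and $a \in \Sigma$, then there exists an edge labeled with character $a \in \Sigma$ from node $[x]_{\mathcal{T}}$ to node $[xa]_{\mathcal{T}}$. This edge is called primary if $|\mathit{long}([x]_{\mathcal{T}})| + 1 = |\mathit{long}([xa]_{\mathcal{T}})|$, and is called secondary otherwise. For each node $[x]_{\mathcal{T}}$ of $\mathit{DAWG}(\mathcal{T})$ with $|x| \ge 1$, let $\mathit{slink}([x]_{\mathcal{T}}) = y$, where $y$ is the longest suffix of $\mathit{long}([x]_{\mathcal{T}})$ which does not belong to $[x]_{\mathcal{T}}$.

In the example of Fig. 2.1, [] T = {, }. The edge labeled with from node [] T to node [] T is primary, while the edge labeled with from [] T to node [] T is secondary. slink([] T ) = [] T.

2.6 Duality of suffix trees and DAWGs

There exists a nice duality between suffix trees and DAWGs. To observe this, it is convenient to consider the collection $\overline{\mathcal{T}}$ of the reversed texts, each of which begins with a special marker $\$_i$, i.e., $\overline{\mathcal{T}} = \{\$_1 \overline{T_1}, \ldots, \$_K \overline{T_K}\}$. For ease of notation, let $S_k = \overline{T_k}$ for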
$1 \le k \le K$ and $\mathcal{S} = \{\$_1 S_1, \ldots, \$_K S_K\} = \overline{\mathcal{T}}$. Then, it is known (cf. [14, 15, 25]) that the reversed suffix links of $\mathit{DAWG}(\mathcal{S})$ coincide with the suffix tree $\mathit{STree}(\mathcal{T})$ for the original text collection $\mathcal{T}$. This fact can also be observed from the other direction. Namely, the hard (resp. soft) W-links of $\mathit{STree}(\mathcal{T})$ coincide with the primary (resp. secondary) edges of $\mathit{DAWG}(\mathcal{S})$. Intuitively, this duality holds because (1) the reversed suffix links of $\mathit{STrie}(\mathcal{S})$ form $\mathit{STrie}(\mathcal{T})$ (and vice versa), and (2) when we construct $\mathit{DAWG}(\mathcal{S})$ from $\mathit{STrie}(\mathcal{S})$, we merge isomorphic subtrees that are connected by suffix links. During this merging process, the reversed suffix links get compacted and the resulting compacted links form the edges of $\mathit{STree}(\mathcal{T})$.

Using this duality, we can immediately show that the total number of hard and soft W-links is linear in the total text length $n$, since the number of edges of the DAWG is linear in $n$. This also means that we can easily maintain the Boolean indicator $w_a$ with $O(n)$ space, so that $w_a(v)$ for a given node $v$ and $a \in \Sigma$ can be answered in $O(\log \sigma)$ time (e.g., at each node $v$ we can maintain a BST storing only the characters $c$ s.t. $w_c(v) = 1$).

2.7 Compact directed acyclic word graphs (CDAWGs)

The compact directed acyclic word graph [15, 26] for a text $T$, denoted $\mathit{CDAWG}(T)$, is the minimal compact automaton which represents $\mathit{Suffix}(T)$. $\mathit{CDAWG}(T)$ can be obtained from $\mathit{STree}(T\$)$ by merging isomorphic subtrees and deleting the associated endmarker $\$ \notin \Sigma$. Since $\mathit{CDAWG}(T)$ is an edge-labeled DAG, we represent a directed edge from node $u$ to $v$ with label string $x \in \Sigma^+$ by a triple $f = (u, x, v)$. For any node $u$, the label strings of the out-going edges from $u$ start with mutually distinct characters.
Formally, $\mathit{CDAWG}(T)$ is defined as follows. For any strings $x, y$, we denote $x \equiv_L y$ (resp. $x \equiv_R y$) iff the beginning positions (resp. ending positions) of $x$ and $y$ in $T$ are equal. Let $[x]_L$ (resp. $[x]_R$) denote the equivalence class of strings w.r.t. $\equiv_L$ (resp. $\equiv_R$). All strings that are not substrings of $T$ form a single equivalence class, and in the sequel we will consider only the substrings of $T$. Let $\overrightarrow{x}$ (resp. $\overleftarrow{x}$) denote the longest member of the equivalence class $[x]_L$ (resp. $[x]_R$). Notice that each member of $[x]_L$ (resp. $[x]_R$) is a prefix of $\overrightarrow{x}$ (resp. a suffix of $\overleftarrow{x}$). Let $\overleftrightarrow{x} = \overleftarrow{(\overrightarrow{x})} = \overrightarrow{(\overleftarrow{x})}$. We denote $x \equiv y$ iff $\overleftrightarrow{x} = \overleftrightarrow{y}$, and let $[x]$ denote the equivalence class w.r.t. $\equiv$. The longest member of $[x]$ is $\overleftrightarrow{x}$ and we will also denote it by $\mathit{value}([x])$. We define $\mathit{CDAWG}(T)$ as an edge-labeled DAG $(V, E)$ such that $V = \{[\overrightarrow{x}]_R \mid x \text{ is a substring of } T\}$ and $E = \{([\overrightarrow{x}]_R, \alpha, [\overrightarrow{x\alpha}]_R) \mid \alpha \in \Sigma^+,\ x \ldots x\alpha\}$. The $\overrightarrow{\cdot}$ operator corresponds to compacting non-branching edges (like the conversion from $\mathit{STrie}(T)$ to $\mathit{STree}(T)$) and the $[\cdot]_R$ operator corresponds to merging isomorphic subtrees of $\mathit{STree}(T)$. For simplicity, we abuse notation so that when we refer to a node of $\mathit{CDAWG}(T)$ as $[x]$, this implies $x = \overrightarrow{x}$ and $[x] = [\overrightarrow{x}]_R$.

Let $[x]$ be any node of $\mathit{CDAWG}(T)$ and consider the suffixes of $\mathit{value}([x])$ which correspond to the suffix tree nodes that are merged when transformed into the CDAWG. We define the suffix link of node $[x]$ by $\mathit{slink}([x]) = [y]$ iff $y$ is the longest suffix of $\mathit{value}([x])$ that does not belong to $[x]$.

It is shown that all nodes of $\mathit{CDAWG}(T)$ except the sink correspond to the maximal repeats of $T$. Actually, $\mathit{value}([x])$ is a maximal repeat in $T$ [58]. Following this fact, one can easily see that the numbers of edges of $\mathit{CDAWG}(T)$ and $\mathit{CDAWG}(\overline{T})$ coincide with the numbers $e^r_T$ and $e^l_T$ of right- and left-extensions of maximal repeats of $T$, respectively [9, 58]. By representing each edge label $\alpha$ with a pair $(i, j)$ of integers such that $T[i..j] = \alpha$, $\mathit{CDAWG}(T)$ can be stored in $O(e^r_T \log n + n \log \sigma)$ bits of space.
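The end-position equivalence that underlies the DAWG nodes of Section 2.5 (and the relation $\equiv_R$ above) can be explored by brute force on small inputs. The following Python sketch is only an illustration under stated assumptions: the function names `end_positions` and `endpos_classes` are hypothetical, the empty string (the source node) is left out, and the enumeration takes quadratic time and space, unlike the linear-size structures discussed in this chapter.

```python
from collections import defaultdict


def end_positions(texts):
    """Map each non-empty substring x to its set of ending positions.

    Positions are pairs (k, j): x ends at 1-based position j of the
    k-th text, mirroring the definition of Epos in Section 2.5.
    """
    epos = defaultdict(set)
    for k, text in enumerate(texts, start=1):
        for i in range(len(text)):
            for j in range(i + 1, len(text) + 1):
                # text[i:j] covers 1-based positions i+1..j, so it ends at j.
                epos[text[i:j]].add((k, j))
    return epos


def endpos_classes(texts):
    """Group substrings with identical end-position sets.

    Each returned list is one equivalence class, sorted by length, so
    its last element plays the role of long([x]).
    """
    groups = defaultdict(list)
    for x, positions in end_positions(texts).items():
        groups[frozenset(positions)].append(x)
    return [sorted(members, key=len) for members in groups.values()]


for members in endpos_classes(["abab"]):
    print(members)
```

In every class printed, each member is a suffix of the longest one, which matches the property of $\mathit{long}([x]_{\mathcal{T}})$ stated in Section 2.5.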
Chapter 3 Packed Compact Tries

In this chapter, we present a new data structure called the packed compact trie (packed c-trie) which stores a set $S$ of $k$ strings of total length $n$ in $n \log \sigma + O(k \log n)$ bits of space and supports fast pattern matching queries and updates, where $\sigma$ is the alphabet size. Assume that $\alpha = \log_\sigma n$ letters are packed in a single machine word on the standard word RAM model, and let $f(k, n)$ denote the query and update times of the dynamic predecessor/successor data structure of our choice which stores $k$ integers from the universe $[1, n]$ in $O(k \log n)$ bits of space. Then, given a string of length $m$, our packed c-tries support pattern matching queries and insert/delete operations in $O(\frac{m}{\alpha} f(k, n))$ worst-case time and in $O(\frac{m}{\alpha} + f(k, n))$ expected time. Our experiments show that our packed c-tries are faster than the standard compact tries (a.k.a. Patricia trees) on real data sets. As an application of our packed c-trie, we show that the sparse suffix tree for a string of length $n$ over prefix codes with $k$ sampled positions, such as evenly-spaced and word delimited sparse suffix trees, can be constructed online in $O((\frac{n}{\alpha} + k) f(k, n))$ worst-case time and $O(\frac{n}{\alpha} + k f(k, n))$ expected time with $n \log \sigma + O(k \log n)$ bits of space. When $k = O(\frac{n}{\alpha})$, by using the state-of-the-art dynamic predecessor/successor data structures, we obtain sub-linear time construction algorithms using only $O(\frac{n}{\alpha})$ bits of space in both cases. We also
discuss an application of our packed c-tries to online LZD factorization.

3.1 Background

The trie for a set $S$ of strings of total length $n$ is a classical data structure which occupies $O(n \log n + n \log \sigma)$ bits of space and allows for prefix search and insertion/deletion for a given string of length $m$ in $O(m \log \sigma)$ time, where $\sigma$ is the alphabet size. The compact trie for $S$ is a path-compressed trie where the edges in every non-branching path are merged into a single edge [53]. By representing each edge label by a pair of positions in a string in $S$, the compact trie can be stored in $n \log \sigma + O(k \log n)$ bits of space, where $k$ is the number of strings in $S$, retaining the same time efficiency for prefix search and insertion/deletion for a given string. Thus, compact tries have widely been used in numerous applications such as dynamic dictionary matching [44], suffix trees [68], sparse suffix trees [47], external string indexes [30], and grammar-based text compression [39].

In this chapter, we show how to accelerate prefix search queries and update operations of compact tries on the standard word RAM model with machine word size $w = \log n$, still keeping $n \log \sigma + O(k \log n)$-bit space usage. A basic idea is to use the packed string matching approach [12], where $\alpha = \log_\sigma n$ consecutive letters are packed in a single word and can be manipulated in $O(1)$ time. In this setting, we can read a given pattern $P$ of length $m$ in $O(\frac{m}{\alpha})$ time, but, during the traversal of $P$ over a compact trie, there can be at most $m$ branching nodes. Thus, a naïve implementation of a compact trie takes $O(\frac{m}{\log_\sigma n} + m \log \sigma) = O(m \log \sigma)$ time even in the packed matching setting.

To overcome the above difficulty, we propose how to quickly process long non-branching paths using bit manipulations, and how to quickly process dense branching subtrees using fast predecessor/successor queries and dictionary look-ups. As a result,
we obtain a new compact trie called the packed compact trie (packed c-trie) for a dynamic set $S$ of strings with the following efficiency:

Theorem 1 (main result) Let $f(k, n)$ be the query/update times of an arbitrary dynamic predecessor/successor data structure using $O(k \log n)$ bits of space for a dynamic set of $k$ integers from the universe $[1, n]$. Our packed c-trie stores a set $S$ of $k$ strings of total length $n$ in $n \log \sigma + O(k \log n)$ bits of space and supports prefix search and insertion/deletion for a given string of length $m$ in $O(\frac{m}{\alpha} f(k, n))$ worst-case time or in $O(\frac{m}{\alpha} + f(k, n))$ expected time.

Using Beame and Fich's data structure [6] or Willard's y-fast trie [70] as the dynamic predecessor/successor data structure, we obtain the following corollary:

Corollary 2 There exists a packed c-trie for a dynamic set $S$ of strings which uses $n \log \sigma + O(k \log n)$ bits of space, and supports prefix search and insert/delete operations for a given string of length $m$ in $O(\frac{m}{\alpha} \cdot \frac{\log\log k \, \log\log n}{\log\log\log n})$ worst-case time or in $O(\frac{m}{\alpha} + \log\log n)$ expected time.

Unlike most other (compact) tries, our packed c-trie does not maintain a dictionary or a search structure for the children of each node. Instead, we partition our c-trie into $\lceil h/\alpha \rceil$ levels, where $h$ is the length of the longest string in $S$. Then each subtree of height $\alpha$, called a micro c-trie, maintains a predecessor/successor dictionary that processes prefix search inside the micro c-trie. A reduction from prefix search to predecessor/successor queries was already considered in an earlier work by Cole et al. [19]; however, their data structure is static, whereas our micro c-tries are dynamic. A similar technique to our packed c-trie was used in the linked dynamic uncompacted trie by Jansson et al. [46]. Our experiments show that our packed c-tries are faster than Patricia trees for both construction and prefix search in almost all data sets we tested.
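The packed string idea used above, processing a whole word of letters with a constant number of bit operations, can be sketched in Python for one basic primitive: the longest common prefix of two packed strings of equal length. This is a simulation under assumptions rather than the data structure of this chapter: Python integers stand in for machine words, `int.bit_length` stands in for a constant-time most-significant-bit instruction, and the names `bits_per_letter`, `pack`, and `packed_lcp` are hypothetical.

```python
def bits_per_letter(alphabet):
    """Number of bits needed to encode one letter of `alphabet`."""
    return max(1, (len(alphabet) - 1).bit_length())


def pack(s, alphabet):
    """Pack string `s` into one integer, first letter in the top bits."""
    bits = bits_per_letter(alphabet)
    word = 0
    for ch in s:
        word = (word << bits) | alphabet.index(ch)
    return word


def packed_lcp(x, y, length, bits):
    """LCP length (in letters) of two packed strings of `length` letters.

    XOR leaves a 1 at every differing bit; the position of the highest
    such bit tells how many leading bits, and hence letters, agree.
    """
    diff = x ^ y
    if diff == 0:
        return length
    equal_bits = length * bits - diff.bit_length()
    return equal_bits // bits


ALPHABET = "acgt"
BITS = bits_per_letter(ALPHABET)
x = pack("acgtacgt", ALPHABET)
y = pack("acgaacgt", ALPHABET)
print(packed_lcp(x, y, 8, BITS))
```

Both strings must have the same number of letters, matching the simplifying assumption of fixed-length short strings; shorter strings would need an explicit length or padding convention that this sketch does not model.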
We also show two applications of our packed c-tries. The first application is online construction of evenly sparse suffix trees [47], word suffix trees [45], and their extension [64]. The existing algorithms for these sparse suffix trees take $O(n \log \sigma)$ worst-case time using $n \log \sigma + O(k \log n)$ bits of space, where $k$ is the number of suffixes stored in the output sparse suffix tree. Using our packed c-tries, we achieve $O((\frac{n}{\alpha} + k) \frac{\log\log k \, \log\log n}{\log\log\log n})$ worst-case construction time and $O(\frac{n}{\alpha} + k \log\log n)$ expected construction time. The former is sublinear in $n$ when $k = O(\frac{n}{\alpha})$ and $\sigma = \mathrm{polylog}(n)$; the latter is sublinear in $n$ when $k = o(\frac{n}{\log\log n})$ and $\sigma = \mathrm{polylog}(n)$. To achieve these results, we show that in our packed c-trie, prefix searches and insertion operations can be started not only from the root but from any node. This capability is necessary for online sparse suffix tree construction, since during the suffix link traversal we have to insert new leaves from non-root internal nodes.

The second application is online computation of the LZ-Double factorization [39] (LZDF), a state-of-the-art online grammar-based text compressor. Goto et al. [39] presented a Patricia-tree based algorithm which computes the LZDF of a given string $T$ of length $n$ in $O(k(M + \min\{k, M\} \log \sigma))$ worst-case time using $O(n \log \sigma)$ bits of space, where $k \le n$ is the number of factors and $M \le n$ is the length of the longest factor. Using our packed c-tries, we achieve a good expected performance with $O(k(\frac{M}{\alpha} + f(k, n)))$ time for LZDF.

3.1.1 Related work

Belazzougui et al. [7] proposed a randomized compact trie called the signed dynamic z-fast trie, which stores a dynamic set $S$ of $k$ strings in $n \log \sigma + O(k \log n)$ bits of
space. Given a string of length $m$, the signed dynamic z-fast trie supports prefix search in $O(\frac{m}{\alpha} + \log m)$ worst-case time only with high probability, and supports insert/delete operations in $O(\frac{m}{\alpha} + \log m)$ expected time only with high probability.¹ On the other hand, our packed c-trie always returns the correct answer for prefix search, and always inserts/deletes a given string correctly, in the bounds stated in Theorem 1 and Corollary 2.

Andersson and Thorup [3] proposed the exponential search tree which uses $n \log \sigma + O(k \log n)$ bits of space, and supports prefix search and insert/delete operations in $O(m + \sqrt{\frac{\log k}{\log\log k}})$ worst-case time. Each node $v$ of the exponential search tree stores a constant-time look-up dictionary for some children of $v$ and a dynamic predecessor/successor data structure for the other children of $v$. This implies that given a string of length $m$, at most $m$ nodes in the search path for the string must be processed one by one, and hence packing $\alpha = \log_\sigma n$ letters in a single word does not seem to speed up the exponential search tree.

Fischer and Gawrychowski's wexponential search tree [33] uses $n \log \sigma + O(k \log n)$ bits of space, and supports prefix search and insert/delete operations in $O(m + \frac{(\log\log\sigma)^2}{\log\log\log\sigma})$ worst-case time. When $\sigma = \mathrm{polylog}(n)$, our packed c-trie achieves $O(m \frac{\log\sigma}{\log n} \cdot \frac{\log\log k \, \log\log n}{\log\log\log n}) = O(m \frac{(\log\log n)^2}{\log n \, \log\log\log n}) = O(o(1) m)$ worst-case time, while the wexponential search tree requires $O(m + \frac{(\log\log\log n)^2}{\log\log\log\log n})$ time.²

¹ The $O(\log m)$ expected bound for insertion/deletion stated in [7] assumes that the prefix search for the string has already been performed.
² For sufficiently long patterns of length $m = \Theta(n)$, our packed c-trie achieves worst-case sublinear $o(n)$ time while the wexponential search tree requires $O(n)$ time.
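As a point of reference for the structures compared above, a plain (unpacked) compact trie in the style of a Patricia tree can be sketched in Python as follows. This is a minimal baseline sketch and not the packed c-trie of this chapter. It assumes a prefix-free input set (here enforced by a `$` terminator), stores edge labels as Python strings instead of position pairs, and uses hypothetical names (`Node`, `insert`, `longest_prefix_len`). The search visits nodes one by one, which is the per-node cost that the packed approach aims to avoid.

```python
class Node:
    """Compact trie node; children maps a first letter to (label, child)."""

    def __init__(self):
        self.children = {}


def lcp_len(x, y):
    """Length of the longest common prefix of strings x and y."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


def insert(root, s):
    """Insert string s, splitting an edge where s leaves its label."""
    node = root
    while s:
        first = s[0]
        if first not in node.children:
            node.children[first] = (s, Node())    # new leaf edge
            return
        label, child = node.children[first]
        h = lcp_len(label, s)
        if h < len(label):
            # Split the edge at offset h and hang the old subtree below.
            mid = Node()
            mid.children[label[h]] = (label[h:], child)
            node.children[first] = (label[:h], mid)
            child = mid
        node, s = child, s[h:]


def longest_prefix_len(root, p):
    """Length of the longest prefix of p spelled out from the root."""
    node, matched = root, 0
    while matched < len(p):
        entry = node.children.get(p[matched])
        if entry is None:
            break
        label, child = entry
        h = lcp_len(label, p[matched:])
        matched += h
        if h < len(label):
            break                                 # stopped inside an edge
        node = child
    return matched


root = Node()
for word in ["banana$", "band$", "bat$"]:
    insert(root, word)
print(longest_prefix_len(root, "bandit"))
```

The value returned by `longest_prefix_len` is the string depth of the locus where the search stops, whether that locus is a node or a position inside an edge.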
3.2 Preliminaries

3.2.1 Compact tries

Let $S = \{X_1, \ldots, X_k\}$ be a set of $k$ non-empty strings of total length $n$. We consider dynamic data structures for $S$ allowing for fast prefix searches of given patterns over strings in $S$, and fast insertion/deletion of strings to/from $S$.

Suppose $S$ is prefix-free. The trie of $S$ is a tree s.t. each edge is labeled by a single letter, the labels of the out-going edges of each node are distinct, and for each $X_i \in S$ there is a unique leaf $\ell_i$ s.t. the path from the root to $\ell_i$ spells out $X_i$.

The compact trie $T_S$ of $S$ is a path-compressed trie obtained by contracting non-branching paths into single edges. Namely, in $T_S$, each edge is labeled by a non-empty substring of a string in $S$, each internal node has at least two children, the out-going edges from each node begin with distinct letters, and each edge label $x$ is encoded by a triple $\langle i, a, b \rangle$ such that $x = X_i[a..b]$ for some $1 \le i \le k$ and $1 \le a \le b \le |X_i|$. The length of an edge $e$, denoted $|e|$, is the length of its label string. Let $\mathit{root}(T_S)$ denote the root of the compact trie $T_S$. For any node $v$, let $\mathit{parent}(v)$ denote its parent. For convenience, let $\bot$ be an auxiliary node s.t. $\mathit{parent}(\mathit{root}(T_S)) = \bot$. We assume the edge from $\bot$ to $\mathit{root}(T_S)$ is labeled by an arbitrary letter. For any node $v$, let $\mathit{str}(v)$ denote the string obtained by concatenating the edge labels from the root to $v$. Each node $v$ stores $|\mathit{str}(v)|$.

Let $s$ be a prefix of any string in $S$. Let $v$ be the shallowest node of $T_S$ such that $s$ is a prefix of $\mathit{str}(v)$ (notice $s$ can be equal to $\mathit{str}(v)$), and let $u = \mathit{parent}(v)$. The locus of string $s$ in $T_S$ is a pair $\phi = (e, h)$, where $e$ is the edge from $u$ to $v$ and $h$ is the offset from $u$, namely, $h = |s| - |\mathit{str}(u)|$.³ We extend the $\mathit{str}$ function to a locus $\phi$, so that $\mathit{str}(\phi) = s$. The string depth of locus $\phi$ is $d(\phi) = |\mathit{str}(\phi)|$. A string $P$ is recognized by

³ In the literature the locus is represented by $(u, c, h)$ where $c$ is the first letter of the label of $e$. Since our packed c-trie does not maintain a search structure for branches, we represent the locus directly on $e$.
$T_S$ iff there is a locus $\phi$ with $\mathsf{str}(\phi) = P$.

We consider the following query and operations on dynamic compact tries.

$\mathsf{LPS}(\phi, P)$: Given a locus $\phi$ in $T_S$ and a pattern string $P$, it returns the locus $\hat\phi$ of string $\mathsf{str}(\phi)Q$ in $T_S$, where $Q$ is the longest prefix of $P$ for which $\mathsf{str}(\phi)Q$ is recognized by $T_S$. When $\phi = ((\bot, \mathsf{root}(T_S)), 1)$, the query is known as the longest prefix search for the pattern $P$ in the compact trie.

$\mathsf{Insert}(\phi, X)$: Given a locus $\phi$ in $T_S$ and a string $X$, it inserts a new leaf which corresponds to a new string $\mathsf{str}(\phi)X \in S$ into the compact trie, from the given locus $\phi$. When there is no node at the locus $\hat\phi = \mathsf{LPS}(\phi, X)$, a new node is created at $\hat\phi$ as the parent of the leaf. When $\phi = ((\bot, \mathsf{root}(T_S)), 1)$, this is the standard insertion of string $X$ into $T_S$.

$\mathsf{Delete}(X_i)$: Given a string $X_i \in S$, it deletes the leaf $\ell_i$. If the out-degree of the parent $v$ of $\ell_i$ becomes 1 after the deletion of $\ell_i$, then the in-coming and out-going edges of $v$ are merged into a single edge, and $v$ is also deleted.

Dynamic predecessor data structures.

For a dynamic set $I \subseteq [1, n]$ of $k$ integers of $w = \log n$ bits each, dynamic predecessor data structures (e.g., [6, 7, 71]) efficiently support the predecessor query $\mathsf{Pred}(X) = \max(\{Y \in I \mid Y \le X\} \cup \{0\})$, the successor query $\mathsf{Succ}(X) = \min(\{Y \in I \mid Y \ge X\} \cup \{n + 1\})$, and insert/delete operations for $I$.

Theorem 3 Let $f(k, n)$ be the time complexity for predecessor/successor queries and insert/delete operations of an arbitrary dynamic predecessor/successor data structure which occupies $O(k \log n)$ bits of space. Beame and Fich's data structure [6] achieves $f(k, n) = O(\frac{(\log\log k)(\log\log n)}{\log\log\log n})$ worst-case time.

Theorem 4 Let $f(k, n)$ be the time complexity for predecessor/successor queries and insert/delete operations of an arbitrary dynamic predecessor/successor data structure
which occupies $O(k \log n)$ bits of space. Willard's Y-fast trie [70] achieves $f(k, n) = O(\log\log n)$ expected time.

3.3 Packed dynamic compact tries

This section presents our new dynamic compact tries called the packed dynamic compact tries (packed c-tries) for a dynamic set $S = \{X_1, \ldots, X_k\}$ of $k$ strings of total length $n$, which achieves the main result in Theorem 1. In the sequel, a string $X \in \Sigma^*$ is called short if $|X| \le \alpha = \log_\sigma n$, and is called long if $|X| > \alpha$.

Micro dynamic compact tries for short strings.

In this subsection, we present our data structure storing short strings. Our input is a dynamic set $S = \{X_1, \ldots, X_k\}$ of $k$ strings of total length $n$, such that $|X_i| \le \alpha = \log_\sigma n$ for every $1 \le i \le k$. Hence it holds that $k \le \sigma^\alpha = n$. For simplicity, we assume for now that $|X_i| = \alpha$ for every $1 \le i \le k$. The general case where $S$ contains strings shorter than $\alpha$ will be explained later in Remark 1.

The dynamic data structure for short strings, called a micro c-trie and denoted $MT_S$, consists of the following:

(i) A dynamic compact trie of height exactly $\alpha$ storing the set $S$. Let $N$ be the set of internal nodes, and let $L = \{\ell_1, \ldots, \ell_k\}$ be the set of $k$ leaves such that $\ell_i$ corresponds to $X_i$ for $1 \le i \le k$. Since every internal node is branching, $|N| \le k - 1$. Every node $v$ of $MT_S$ corresponds to the string $\mathsf{str}(v)$ of $\log n$ bits. Overall, this compact trie requires $n \log \sigma + O(k \log n)$ bits of space (including $S$).

(ii) A dynamic predecessor/successor data structure $D$ which stores the set $S = \{X_1, \ldots, X_k\}$ of strings in $O(k \log n)$ bits of space, where each $X_i$ is regarded as a $\log n$-bit integer. $D$ supports predecessor/successor queries and insert/delete operations in $f(k, n)$ time each.

Clearly $MT_S$ requires $n \log \sigma + O(k \log n)$ bits of total space.

The next lemma shows how to support in $O(1)$ time LCP queries for strings represented by two given nodes on the dynamic micro c-trie $MT_S$. This is related to the labeling scheme (e.g., see [1]) which assigns a short label to each node so that later, given the labels of two nodes, the label of the LCA of the nodes can be answered in $O(1)$ time. Although a static tree is considered in the labeling scheme, our micro c-trie is dynamic. Also, our algorithm is much simpler than applying the dynamic LCA data structure [20] to our micro c-tries.

Lemma 1 For any nodes $u$ and $v$ of the dynamic micro c-trie $MT_S$, we can compute $\mathsf{LCP}(\mathsf{str}(u), \mathsf{str}(v))$ in $O(1)$ time.

Proof 1 We pad $\mathsf{str}(u)$ and/or $\mathsf{str}(v)$ with an arbitrary letter $c$ so they become $\alpha$ long each, namely, let $P = \mathsf{str}(u)c^{\alpha - |\mathsf{str}(u)|}$ and $Q = \mathsf{str}(v)c^{\alpha - |\mathsf{str}(v)|}$. We compute the most significant bit (msb) of the XOR of the bit representations of $P$ and $Q$. Let $b$ be the bit position of the msb, and let $z = \lfloor (b - 1)/\log\sigma \rfloor$. W.l.o.g. assume $|\mathsf{str}(u)| \le |\mathsf{str}(v)|$.

(1) If $z < |\mathsf{str}(u)|$, then $\mathsf{str}(u)[1, z] = \mathsf{LCP}(\mathsf{str}(u), \mathsf{str}(v))$. In this case, there exists a branching node $y$ such that $\mathsf{str}(y) = \mathsf{str}(u)[1, z]$, and hence $\mathsf{LCP}(\mathsf{str}(u), \mathsf{str}(v)) = \mathsf{str}(y)$.

(2) If $z \ge |\mathsf{str}(u)|$, then $\mathsf{str}(u)$ is a prefix of $\mathsf{str}(v)$, and hence $\mathsf{str}(u) = \mathsf{LCP}(\mathsf{str}(u), \mathsf{str}(v))$.

Since each of $P$ and $Q$ is stored in a single machine word, we can compute the XOR of $P$ and $Q$ in $O(1)$ time. The msb can be computed in $O(1)$ time using the technique of Fredman and Willard [35]. This completes the proof.

On micro c-tries, prefix searches and insertion operations can be started not only from the root but from any node. This is necessary for online sparse suffix tree construction based on Ukkonen's algorithm [65], since during the suffix link traversal we have to insert new leaves from non-root internal nodes.

Theorem 5 The micro c-trie $MT_S$ supports $\mathsf{LPS}(\phi, X)$ queries in $O(f(k, n))$ time.
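The XOR-and-msb computation in the proof of Lemma 1 can be sketched as follows. This is a minimal illustration, not the thesis implementation: the names `pack`, `lcp_length` and `DNA_CODE` are hypothetical, a Python integer stands in for the machine word, and `int.bit_length()` stands in for the constant-time msb technique of Fredman and Willard. The bit position $b$ is counted from the most significant end, starting at 1.

```python
# Hypothetical 2-bit letter codes for a 4-letter alphabet (log sigma = 2).
DNA_CODE = {"a": 0, "c": 1, "g": 2, "t": 3}


def pack(s, alpha, bits, code, pad=0):
    """Pack s (|s| <= alpha) into an alpha*bits-bit integer.

    The first letter occupies the most significant bits; missing
    positions are filled with the arbitrary padding letter code `pad`.
    """
    word = 0
    for ch in s:
        word = (word << bits) | code[ch]
    for _ in range(alpha - len(s)):
        word = (word << bits) | pad
    return word


def lcp_length(u, v, alpha, bits, code):
    """Length of LCP(u, v), computed from the msb of pack(u) XOR pack(v)."""
    x = pack(u, alpha, bits, code) ^ pack(v, alpha, bits, code)
    if x == 0:
        z = alpha                        # padded words agree everywhere
    else:
        # 1-based position of the msb, counted from the most significant end
        b = alpha * bits - x.bit_length() + 1
        z = (b - 1) // bits              # number of leading letters that agree
    # Cases (1) and (2) of the proof: never report more than the shorter string.
    return min(z, len(u), len(v))


print(lcp_length("acg", "act", 4, 2, DNA_CODE))
print(lcp_length("ac", "acgt", 4, 2, DNA_CODE))
```

The final `min` covers case (2) of the proof, where the padded words may agree beyond the end of the shorter string.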
Proof 2 Let $P$ be the prefix of $\mathsf{str}(\phi)X$ of length $\alpha$, i.e., $P = \mathsf{str}(\phi)X[1..\alpha - d(\phi)]$. The case where $P$ is represented by a leaf is easy, and thus, in what follows we focus on the case where $P$ is not represented by a leaf.

First, we compute the string depth $d = d(\hat\phi) \in [0, \alpha]$. Observe that $d = \max\{|\mathsf{LCP}(P, \mathsf{Pred}(P))|, |\mathsf{LCP}(P, \mathsf{Succ}(P))|\}$. Given $P$, we compute $\mathsf{Pred}(P)$ and $\mathsf{Succ}(P)$ in $O(f(k, n))$ time. Then, we can compute $\mathsf{LCP}(P, \mathsf{Pred}(P))$ in $O(1)$ time by computing the msb of the XOR of the bit representations of $P$ and $\mathsf{Pred}(P)$, as in Lemma 1. $\mathsf{LCP}(P, \mathsf{Succ}(P))$ can be computed analogously, and thus, $d = d(\hat\phi)$ can be computed in $O(f(k, n))$ time.

Second, we locate $e = (u, v)$. See also Fig. 3.1. Let $Z = P[1, d]$. Let $LB = Zc_1^{\alpha - |Z|}$ and $UB = Zc_\sigma^{\alpha - |Z|}$ be the lexicographically least and greatest strings of length $\alpha$ with prefix $Z$, respectively, where $c_1$ and $c_\sigma$ denote the smallest and largest letters. To locate $u$ and $v$ in $MT_S$, we find the leftmost and rightmost leaves $X_L$ and $X_R$ below $\hat\phi$ by $X_L = \mathsf{Succ}(LB)$ and $X_R = \mathsf{Pred}(UB)$. Then, the longer one of $\mathsf{LCP}(X_{L-1}, X_L)$ and $\mathsf{LCP}(X_R, X_{R+1})$ corresponds to the origin node $u$ of $e$, and $\mathsf{LCP}(X_L, X_R)$ corresponds to the destination node $v$ of $e$. These LCPs can be computed in $O(1)$ time by Lemma 1.

What remains is how to access the nodes $u$ and $v$ representing these strings. In so doing, let \$ be a special character that does not appear in any string in $S$. For each string $Y$ represented by an internal node of $MT_S$, we pad \$ at the end of $Y$ so its length becomes exactly $\alpha$, namely, we obtain $Y\$^{\alpha - |Y|}$. We insert this padded string into a dynamic dictionary dedicated only to internal nodes (here we use a predecessor/successor data structure). Now, given a string represented by an internal node, we can access the corresponding node in $O(f(k, n))$ time. Finally we obtain $\hat\phi = ((u, v), d - |\mathsf{str}(u)|)$ in overall $O(f(k, n))$ time.

It follows from the proof of Theorem 5 that a dynamic predecessor/successor data structure is enough to support pattern matching queries on our dynamic micro c-trie. This implies that we do not have to store (the triples for) the edge labels in the micro c-trie.
This observation is important when we consider delete operations on the set $S$, as we will see in the next lemma.
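The two steps in the proof of Theorem 5 can be sketched as follows, under loud assumptions: a sorted Python list searched with `bisect` stands in for the dynamic predecessor/successor structure $D$ (so it does not achieve the $f(k, n)$ bound), strings are compared directly instead of as packed words, and the names `answer_depth` and `leaf_range` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def lcp_len(x, y):
    """Length of the longest common prefix of x and y."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


def answer_depth(keys, p):
    """d = max(|LCP(p, Pred(p))|, |LCP(p, Succ(p))|) over the sorted list keys."""
    i = bisect_right(keys, p)            # keys[i - 1] is the largest key <= p
    j = bisect_left(keys, p)             # keys[j] is the smallest key >= p
    pred = keys[i - 1] if i > 0 else None
    succ = keys[j] if j < len(keys) else None
    return max((lcp_len(p, y) for y in (pred, succ) if y is not None), default=0)


def leaf_range(keys, z, alpha, lo, hi):
    """Indices (L, R) of the leftmost and rightmost keys having prefix z."""
    lb = z + lo * (alpha - len(z))       # least string of length alpha with prefix z
    ub = z + hi * (alpha - len(z))       # greatest string of length alpha with prefix z
    return bisect_left(keys, lb), bisect_right(keys, ub) - 1


keys = ["aab", "aba", "abb", "bba"]      # the set S with alpha = 3 over {a, b}
d = answer_depth(keys, "baa")
print(d, leaf_range(keys, "baa"[:d], 3, "a", "b"))
```

Once the indices $L$ and $R$ are known, the neighbours `keys[L - 1]` and `keys[R + 1]` (when they exist) give the two LCPs that identify the origin node $u$, and `keys[L]`, `keys[R]` give the destination node $v$.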
Figure 3.1: Given the initial locus $\phi$ (which is on the root in this figure) and a query pattern $P$, the algorithm of Theorem 5 answers the $\mathsf{LPS}(\phi, P)$ query on the micro c-trie as in this figure. The answer to the query is the locus $\hat\phi$ for $P[1..5]$.

Lemma 2 The micro c-trie $MT_S$ supports $\mathsf{Insert}(\phi, X)$ and $\mathsf{Delete}(X)$ operations in $O(f(k, n))$ time. We assume that $d(\phi) + |X| \le \alpha$ so that the height of the micro compact trie will always be kept within $\alpha$.

Proof 3 We show how to support $\mathsf{Insert}(\phi, X)$ in $O(f(k, n))$ time. Initially $S = \emptyset$, the micro compact trie $MT_S$ consists only of $\mathsf{root}(MT_S)$, and the predecessor/successor dictionary $D$ contains no elements. When the first string $X$ is inserted to $S$, we create a leaf below the root and insert $X$ to $D$. Suppose that the data structure maintains a string set $S$ with $|S| \ge 1$. To insert a string $X$ from the given locus $\phi$, we first conduct the $\mathsf{LPS}(\phi, X)$ query of Theorem 5, and let $\hat\phi = (e, h)$ be the answer to the query. If $h = |e|$, then we simply insert a new leaf $\ell$ from the destination node of $e$. Otherwise, we split $e$ at $\hat\phi$ and create a new node $v$ there as the parent of the new leaf, such that $\mathsf{str}(v) = \mathsf{str}(\hat\phi)$. The rest is the same as in the former case. After the new leaf is inserted, we insert $\mathsf{str}(\phi)X$ to $D$ in $O(f(k, n))$ time.

We consider $\mathsf{Delete}(X)$. Recall that each edge of the micro c-trie does not store
the triple representing its string label. Thanks to this property, we need not consider updates of the labels of the edges in the path from the root to the deleted leaf (which usually becomes problematic in compact tries). Thus, we can support $\mathsf{Delete}(X)$ in a similar way to $\mathsf{Insert}(\phi, X)$, in $O(f(k, n))$ time.

Remark 1 When $d(\phi) + |X| < \alpha$, we can support $\mathsf{Insert}(\phi, X)$ and $\mathsf{LPS}(\phi, X)$ as follows. When inserting $X$, we pad $X$ with a special letter \$ which does not appear in $S$. Namely, we perform the $\mathsf{Insert}(\phi, X')$ operation with $X' = X\$^{\alpha - d(\phi) - |X|}$. When computing $\mathsf{LPS}(\phi, X)$, we pad $X$ with another special letter $\# \ne \$$ which does not appear in $S$. Namely, we perform the $\mathsf{LPS}(\phi, X'')$ query with $X'' = X\#^{\alpha - d(\phi) - |X|}$. This gives us the correct locus for $\mathsf{LPS}(\phi, X)$.

Packed dynamic compact tries for long strings.

In this subsection, we present the packed dynamic compact trie (packed c-trie) $PT_S$ for a set $S$ of variable-length strings of length at most $O(2^w) = O(n)$.

Figure 3.2: Micro-trie decomposition: The packed c-trie is decomposed into a number of micro c-tries (gray rectangles), each of which is of height $\alpha = \log_\sigma n$. Each micro-trie is equipped with a dynamic predecessor/successor data structure.
3.3.3 Micro trie decomposition.

We decompose $PT_S$ into a number of micro c-tries. See also Fig. 3.2. Let $h > \alpha$ be the length of the longest string in $S$. We categorize the nodes of $PT_S$ into $\lceil h/\alpha \rceil + 1$ levels: We say that a node $v$ of $PT_S$ is at level $i$ ($0 \le i \le \lceil h/\alpha \rceil$) iff $|\mathsf{str}(v)| \in [i\alpha, (i + 1)\alpha - 1]$. The level of a node $v$ is denoted by $\mathsf{level}(v)$. A locus $\phi$ of $PT_S$ is called a boundary iff $d(\phi)$ is a multiple of $\alpha$.

Consider any path from $\mathsf{root}(PT_S)$ to a leaf, and assume that there is no node at some boundary $i\alpha$ on this path. We create an auxiliary node at that boundary on this path, iff there is at least one non-auxiliary (i.e., original) node at level $i - 1$ or $i + 1$ on this path. Let $BN$ denote the set of nodes at the boundaries, called the boundary nodes. For each boundary node $v \in BN$, we create a micro compact trie $MT$ whose root $\mathsf{root}(MT)$ is $v$, whose internal nodes are all descendants $u$ of $v$ with $\mathsf{level}(u) = \mathsf{level}(v)$, and whose leaves are all boundary descendants $\ell$ of $v$ with $\mathsf{level}(\ell) = \mathsf{level}(v) + 1$. Notice that each boundary node is the root of a micro c-trie at its level and is also a leaf of a micro c-trie at the previous level.

An edge is said to be a long edge iff its label is at least $\alpha$ long. We store the label of each long edge by a triple of integers. Recall that, on the other hand, we do not store (encodings of) the edge labels in the micro c-tries.

Lemma 3 The packed c-trie $PT_S$ for a prefix-free set $S$ of $k$ strings requires $n \log \sigma + O(k \log n)$ bits of space.

Proof 4 Firstly, we bound the number of auxiliary boundary nodes in $PT_S$. At most 2 auxiliary boundary nodes are created on each original edge of $PT_S$. Since there are at most $2k - 2$ original edges, the total number of auxiliary boundary nodes is at most $4k - 4$. Since there are at most $2k - 1$ original nodes in $PT_S$, the total number of nodes in $PT_S$ is at most $6k - 5$. Clearly, the total number of short strings of length at most $\alpha$ maintained by the micro c-tries is no more than the number of all nodes in $PT_S$. The
number of long edges in $PT_S$ is no more than the number of its nodes. Overall, the total space of $PT_S$ is $n \log \sigma + O(k \log n)$ bits.

For any locus $\phi$ on $PT_S$, $\mathsf{ld}(\phi)$ denotes the local string depth of $\phi$ in the micro c-trie $MT$ that contains $\phi$. Namely, if $\mathsf{root}(MT) = v$, the parent of $v$ in $PT_S$ is $u$, and $e = (u, v)$, then $\mathsf{ld}(\phi) = d(\phi) - d((e, |e|))$. Prefix search queries and insert/delete operations can be supported by our packed c-trie, as follows.

Lemma 4 The packed c-trie $PT_S$ supports the $\mathsf{LPS}(\phi, P)$ query in $O(\frac{m}{\alpha} f(k, n))$ worst-case time, where $m = |P| > \alpha$.

Proof 5 If $m + \mathsf{ld}(\phi) \le \alpha$, the bound immediately follows from Theorem 5. Assume $m + \mathsf{ld}(\phi) > \alpha$, and let $q = \alpha - \mathsf{ld}(\phi) + 1$. We factorize $P$ into $h + 1$ blocks as $p_0 = P[1, q - 1]$, $p_1 = P[q, q + \alpha - 1]$, ..., $p_{h-1} = P[q + (h - 2)\alpha, q + (h - 1)\alpha - 1]$, and $p_h = P[q + (h - 1)\alpha, m]$, where $1 \le |p_0| \le \alpha$, $|p_i| = \alpha$ for $1 \le i \le h - 1$, and $1 \le |p_h| \le \alpha$. Each block can be computed in $O(1)$ time by standard bit operations. If there is a mismatch in $p_0$, we are done. Otherwise, for each $i$ in increasing order from 1 to $h$, we perform an $\mathsf{LPS}(\gamma, p_i)$ query from the root $\gamma$ of the corresponding micro c-trie at each level of the corresponding path starting from $\phi$. This continues until we find either the first mismatch for some $i$ or complete matches for all $i$'s. Each LPS query with each micro c-trie takes $O(f(k, n))$ time by Theorem 5. Since $h = O(\frac{m}{\alpha})$, it takes $O(\frac{m}{\alpha} f(k, n))$ total time.

Lemma 5 The packed c-trie $PT_S$ supports $\mathsf{Insert}(\phi, X)$ and $\mathsf{Delete}(X_i)$ operations in $O(\frac{m}{\alpha} f(k, n))$ worst-case time, where $m = |X| > \alpha$.

Proof 6 $\mathsf{Insert}(\phi, X)$: we first perform $\mathsf{LPS}(\phi, X)$ in $O(\frac{m}{\alpha} f(k, n))$ time (Lemma 4). Let $x_0, \ldots, x_h$ be the factorization of $X$ w.r.t. $\phi$, and let $x_j$ be the block of the factorization containing the first mismatch. Then, we conduct the $\mathsf{Insert}(\gamma, x_j)$ operation on the corresponding micro c-trie, where $\gamma$ is its root. It takes $O(f(k, n))$ time (Lemma 2).
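The block factorization used in the proofs of Lemma 4 and Lemma 5 can be sketched as follows. This is only an illustration under stated assumptions: the function name `factorize` is hypothetical, Python slicing (0-based) replaces the word-level bit operations, and the local depth is assumed to satisfy $0 \le \mathsf{ld}(\phi) < \alpha$.

```python
def factorize(p, alpha, local_depth):
    """Split p into blocks p_0, p_1, ..., p_h aligned to micro c-trie boundaries.

    p_0 fills the remaining alpha - local_depth letters of the current
    micro c-trie; every later block has exactly alpha letters, except
    possibly the last one.
    """
    first = alpha - local_depth          # |p_0| = q - 1 in the text's notation
    if len(p) <= first:
        return [p]                       # the query stays inside one micro c-trie
    blocks = [p[:first]]
    for start in range(first, len(p), alpha):
        blocks.append(p[start:start + alpha])
    return blocks


print(factorize("abcdefghij", 4, 1))
```

Each block then drives one query on one micro c-trie, so the number of blocks, which is $O(m/\alpha)$, is what multiplies $f(k, n)$ in the stated bounds.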
If $j = h$ ($x_j$ is the last block in the factorization of $X$), then we are done. Otherwise, we create a new edge with label $x'_j x_{j+1} \cdots x_h$, where $x'_j$ is the suffix of $x_j$ which begins at the mismatched position, leading to the new leaf $\ell$. We create a new boundary node if necessary. These operations take $O(1)$ time each. Hence, $\mathsf{Insert}(\phi, X)$ takes $O(\frac{m}{\alpha} f(k, n))$ total time.

$\mathsf{Delete}(X_i)$: Let $Q$ be the path from the root $r$ of $PT_S$ to the leaf $\ell_i$. If $\ell_i$ is a child of the root of $PT_S$, then we simply delete the single edge in $Q$. Otherwise, for each sub-path of $Q$ that belongs to a micro c-trie, we perform the Delete operation of Lemma 2 in this micro c-trie. Since the path $Q$ spans at most $\lceil \frac{m}{\alpha} \rceil$ micro c-tries, the delete operations on these micro c-tries take $O(\frac{m}{\alpha} f(k, n))$ total time.

For each long edge in $Q$ whose label refers to $X_i$, let $\langle i, a, b \rangle$ be the triple representing the label. We replace the triple with $\langle i', a', b' \rangle$, where $X_{i'}$ is the predecessor of $X_i$ in $S$ and $X_{i'}[a'..b'] = X_i[a..b]$ (if $X_i$ does not have a predecessor, then we can use the successor of $X_i$ in $S$ instead). We can find $X_{i'}$ as follows. First, we compute $\phi = \mathsf{LPS}(r, X_i) = \mathsf{LCA}(\ell_{i'}, \ell_i)$. Then, we can find $\ell_{i'}$ by traversing the right-most path from $\phi$ that is to the left of the sub-path of $Q$ from $\phi$ to $\ell_i$. This can be done in $O(\frac{m}{\alpha} f(k, n))$ time. The positions $a'$ and $b'$ in $X_{i'}$ can be computed by simple arithmetic, since we know the total length of the labels in the path from $\phi$ to $\ell_{i'}$. Since the path $Q$ contains fewer than $\frac{m}{\alpha}$ long edges, the triples for all long edges in $Q$ can be updated in $O(\frac{m}{\alpha})$ time.

Speeding-up with hashing.

By augmenting each micro c-trie with a hash table storing the short strings, we achieve a good expected performance, as follows:

Lemma 6 The packed c-trie $PT_S$ augmented with hashing supports the $\mathsf{LPS}(\phi, X)$ query, and the $\mathsf{Insert}(\phi, X)$ and $\mathsf{Delete}(X)$ operations, in $O(\frac{m}{\alpha} + f(k, n))$ expected time.

Proof 7 Let $MT$ be any micro c-trie in the packed c-trie $PT_S$, and $M$ the set of
strings maintained by $MT$, each being of length at most $\alpha$. We store all strings of $M$ in a hash table associated to $MT$, which supports look-ups, insertions and deletions in $O(1)$ expected time. Let $x_0, \ldots, x_h$ be the factorization of $X$ w.r.t. $\phi$. To perform $\mathsf{LPS}(\phi, X)$, we ask if $\mathsf{str}(\phi)x_0$ is in the hash table of the corresponding micro c-trie. If the answer is no, the first mismatch occurs in $x_0$, and the rest is the same as in Lemma 4. If the answer is yes, then for each $i$ from 1 to $h$ in increasing order, we ask if $x_i$ is in the hash table of the corresponding micro c-trie, until we receive the first no with some $i$ or we receive yes for all $i$'s. In the latter case, we are done. In the former case, we perform an LPS query with $x_i$ from the root of the corresponding micro c-trie. Since we perform at most one LPS query and $O(\frac{m}{\alpha})$ look-ups for hash tables, it takes $O(\frac{m}{\alpha} + f(k, n))$ expected time. The $O(\frac{m}{\alpha} + f(k, n))$ expected time bounds for $\mathsf{Insert}(\phi, X)$ and $\mathsf{Delete}(X)$ immediately follow from the above arguments.

3.4 Applications to online string processing

Sparse suffix trees.

The suffix tree [68] of a string $T$ of length $n$ is a compact trie which stores all $n$ suffixes of $T$. A sparse suffix tree for a set $K \subseteq [1, n]$ of sampled positions of $T$ is a compact trie which stores only the subset $S = \{T[i..n] \mid i \in K\}$ of the suffixes of $T$ beginning at the sampled positions in $K$. It is known that if the set $K$ of sampled positions satisfies some properties (e.g., every $r$ positions for some fixed $r > 1$, or the positions immediately after the word delimiters), the sparse suffix tree can be constructed in an online manner in $O(n \log \sigma)$ time and $n \log \sigma + O(n \log n)$ bits of space [45, 47, 64].

Packed c-tries can speed up online construction and pattern matching for these sparse suffix trees: Here each input string $X$ to Insert is given as a pair $(i, j)$ of positions in $T$ s.t. $X = T[i..j]$. As Lemma 7 states, the Insert operation in such a case can be
Minimal DFA. minimal DFA for L starting from any other
Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA
More informationDesigning finite automata II
Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of
More informationConvert the NFA into DFA
Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:
More informationDynamic Fully-Compressed Suffix Trees
Motivtion Dynmic FCST s Conclusions Dynmic Fully-Compressed Suffix Trees Luís M. S. Russo Gonzlo Nvrro Arlindo L. Oliveir INESC-ID/IST {lsr,ml}@lgos.inesc-id.pt Dept. of Computer Science, University of
More information1 Nondeterministic Finite Automata
1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you
More informationCoalgebra, Lecture 15: Equations for Deterministic Automata
Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined
More informationp-adic Egyptian Fractions
p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction
More informationCompiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz
University of Southern Cliforni Computer Science Deprtment Compiler Design Fll Lexicl Anlysis Smple Exercises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sciences Institute 4676 Admirlty Wy, Suite
More informationCMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014
CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA
More informationFormal Languages and Automata
Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University
More information1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.
York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech
More informationNondeterminism and Nodeterministic Automata
Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely
More informationTypes of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt
More information12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016
CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple
More informationParse trees, ambiguity, and Chomsky normal form
Prse trees, miguity, nd Chomsky norml form In this lecture we will discuss few importnt notions connected with contextfree grmmrs, including prse trees, miguity, nd specil form for context-free grmmrs
More informationHarvard University Computer Science 121 Midterm October 23, 2012
Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic
More informationTypes of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt
More informationChapter 2 Finite Automata
Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht
More informationFormal languages, automata, and theory of computation
Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm
More informationCS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University
CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted
More informationCS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)
CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts
More informationIntermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4
Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one
More informationCS 275 Automata and Formal Language Theory
CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)
More informationSurface maps into free groups
Surfce mps into free groups lden Wlker Novemer 10, 2014 Free groups wedge X of two circles: Set F = π 1 (X ) =,. We write cpitl letters for inverse, so = 1. e.g. () 1 = Commuttors Let x nd y e loops. The
More informationHomework 3 Solutions
CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.
More informationI1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3
2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is
More informationModel Reduction of Finite State Machines by Contraction
Model Reduction of Finite Stte Mchines y Contrction Alessndro Giu Dip. di Ingegneri Elettric ed Elettronic, Università di Cgliri, Pizz d Armi, 09123 Cgliri, Itly Phone: +39-070-675-5892 Fx: +39-070-675-5900
More informationLecture 09: Myhill-Nerode Theorem
CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives
More informationNFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.
NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD
More informationModule 9: Tries and String Matching
Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer
More informationModule 9: Tries and String Matching
Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer
More informationLecture 08: Feb. 08, 2019
4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny
More informationWhere did dynamic programming come from?
Where did dynmic progrmming come from? String lgorithms Dvid Kuchk cs302 Spring 2012 Richrd ellmn On the irth of Dynmic Progrmming Sturt Dreyfus http://www.eng.tu.c.il/~mi/cd/ or50/1526-5463-2002-50-01-0048.pdf
More informationLecture 3: Equivalence Relations
Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts
More informationAlignment of Long Sequences. BMI/CS Spring 2016 Anthony Gitter
Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostt.wisc.edu Gols for Lecture Key concepts how lrge-scle lignment differs from the simple cse the
More informationCS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata
CS103B ndout 18 Winter 2007 Ferury 28, 2007 Finite Automt Initil text y Mggie Johnson. Introduction Severl childrens gmes fit the following description: Pieces re set up on plying ord; dice re thrown or
More informationConnected-components. Summary of lecture 9. Algorithms and Data Structures Disjoint sets. Example: connected components in graphs
Prm University, Mth. Deprtment Summry of lecture 9 Algorithms nd Dt Structures Disjoint sets Summry of this lecture: (CLR.1-3) Dt Structures for Disjoint sets: Union opertion Find opertion Mrco Pellegrini
More information5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.
Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.
More information1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.
York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech
More informationRegular expressions, Finite Automata, transition graphs are all the same!!
CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1
More informationFarey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University
U.U.D.M. Project Report 07:4 Frey Frctions Rickrd Fernström Exmensrete i mtemtik, 5 hp Hledre: Andres Strömergsson Exmintor: Jörgen Östensson Juni 07 Deprtment of Mthemtics Uppsl University Frey Frctions
More informationFormal Languages and Automata Theory. D. Goswami and K. V. Krishna
Forml Lnguges nd Automt Theory D. Goswmi nd K. V. Krishn Novemer 5, 2010 Contents 1 Mthemticl Preliminries 3 2 Forml Lnguges 4 2.1 Strings............................... 5 2.2 Lnguges.............................
More informationTutorial Automata and formal Languages
Tutoril Automt nd forml Lnguges Notes for to the tutoril in the summer term 2017 Sestin Küpper, Christine Mik 8. August 2017 1 Introduction: Nottions nd sic Definitions At the eginning of the tutoril we
More informationFirst Midterm Examination
24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet
More informationAUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton