Direct construction of compact Directed Acyclic Word Graphs
Maxime Crochemore, Renaud Vérin

To cite this version: Maxime Crochemore, Renaud Vérin. Direct construction of compact Directed Acyclic Word Graphs. Apostolico A. and Hein J. Combinatorial Pattern Matching (Aarhus, 1997), 1997, France. Springer-Verlag, 1264, 1997, LNCS. Submitted on 13 Feb 2013.
Institut Gaspard Monge, Université de Marne-La-Vallée, 2, rue de la Butte Verte, Noisy-Le-Grand, France.

Abstract. The Directed Acyclic Word Graph (DAWG) is an efficient data structure to treat and analyze repetitions in a text, especially in DNA genomic sequences. Here, we consider the Compact Directed Acyclic Word Graph of a word. We give the first direct algorithm to construct it. It runs in time linear in the length of the string on a fixed alphabet. Our implementation requires half the memory space used by DAWGs.

Keywords: pattern matching algorithm, suffix automaton, DAWG, Compact DAWG, suffix tree, index on text.

1 Introduction

In the classical string-matching problem for a word $w$ and a text $T$, we want to know if $w$ occurs in $T$, i.e., if $w$ is a factor of $T$. In many applications, the same text is queried several times. So, efficient solutions are based on data structures built on the text that serve as an index to look for any word $w$ in $T$. The typical running time of various implementations of the search is $O(|w|)$ (on a fixed alphabet). Among the implementations, the suffix tree ([13]) is the most popular. Its size and construction time are linear in the length of the text. It has been studied and used extensively. Apostolico [2] lists over 40 references on it, and Manber and Myers [12] mention several others. Many variants have been developed, like suffix arrays [12], PESTry [11], suffix cactus [10], or suffix binary search trees [9]. Besides, the suffix trie, the non-compact version of the suffix tree, has been refined to the suffix automaton (Directed Acyclic Word Graph, DAWG). This automaton is a good alternative to represent the whole set of factors of a text. It is the minimal automaton accepting this set. It has been fully described by Blumer [3] and Crochemore [7]. As for the suffix tree, its construction time and size are linear in the length of the text.

In the genome research field, DNA sequences can be viewed as words over the alphabet $\{a, c, g, t\}$. They become subjects for linguistic and statistic analysis. For this purpose, suffix automata are useful data structures. Indeed, the structure is fast to compute and easy to use.
Meanwhile, the length of sequences in databases grows rapidly and the bottleneck to using the above data structures is their size. Keeping the index in main memory is more and more difficult for large sequences. So, having a structure using as little space as possible is appreciable for its construction as well as for its utilization. Compression methods are of no use to reduce the memory space of such indexes because they eliminate the direct access to substrings. On the contrary, the Compact Directed Acyclic Word Graph (CDAWG) keeps the direct access while requiring less memory space. The structure has been introduced by Blumer et al. [4, 5]. The automaton is based on the concatenation of factors issued from a same context. This concatenation induces the deletion of all states of outdegree one and of their corresponding transitions, excepting terminal states. This saves 50% of memory space. At the same time, the reduction of the number of states (2/3 less) and transitions (about half less) makes the applications run faster. Both time and space are saved.

In this paper, we give an algorithm to build compact DAWGs. This direct construction avoids constructing the DAWG first, which makes it suitable for the actual DNA sequences (more than 1.5 million nucleotides for some of them). The compact DAWG allows standard treatments to be applied to sequences twice as long in reasonable time (a few minutes).

In Section 2 we recall the basic notions on DAWGs. Section 3 introduces the compact DAWG, also called compact suffix automaton, with the bounds on its size. We show in Section 4 how to build the CDAWG from the DAWG in time linear in the size of this latter structure. The direct construction algorithm for the CDAWG is given in Section 5. A conclusion follows.

2 Definitions

Let $\Sigma$ be a nonempty alphabet and $\Sigma^*$ the set of words over $\Sigma$, with $\varepsilon$ as the empty word. If $w$ is a word in $\Sigma^*$, $|w|$ denotes its length, $w_i$ its $i$-th letter, and $w_{i..j}$ its factor (subword) $w_i w_{i+1} \ldots w_j$. If $w = xyz$ with $x, y, z \in \Sigma^*$, then $x$, $y$, and $z$ denote some factors or subwords of $w$, $x$ is a prefix of $w$, and $z$ is a suffix of $w$. $S(x)$ denotes the set of all suffixes of $x$ and $F(x)$ the set of its factors. For an automaton, the tuple $(p, \alpha, q)$ denotes a transition of label $\alpha$ starting at $p$ and ending at $q$. A roman letter is used for mono-letter transitions, a greek letter for multi-letter transitions. Moreover, $(p, \alpha]$ denotes a transition from $p$ for which $\alpha$ is a prefix of its label.

Here, we recall the definition of the DAWG, and a theorem about its implementation and its size proved in [3] and [7].

Definition 1. The Suffix Automaton of a word $x$, denoted DAWG($x$), is the minimal deterministic automaton (not necessarily complete) that accepts $S(x)$, the (finite) set of suffixes of $x$.

For example, Figure 1 shows the DAWG of the word gtagtaaac. States which are double circled are terminal states.

Theorem 2. The size of the DAWG of a word $x$ is $O(|x|)$ and the automaton can be computed in time $O(|x|)$. The maximum number of states of the automaton is $2|x| - 1$, and the maximum number of edges is $3|x| - 4$.
[Figure 1. DAWG(gtagtaaac)]

Recall that the right context of a factor $u$ of $x$ is $u^{-1}S(x)$. The syntactic congruence, denoted by $\equiv_{S(x)}$, associated with $S(x)$ is defined, for $x, u, v \in \Sigma^*$, by:

$u \equiv_{S(x)} v \iff u^{-1}S(x) = v^{-1}S(x)$.

We call classes of factors the congruence classes of the relation $\equiv_{S(x)}$. The longest word of a class of factors is called the representative of the class. States of DAWG($x$) are exactly the classes of the relation $\equiv_{S(x)}$. Since this automaton is not required to be complete, the class of words not occurring in $x$, corresponding to the empty right context, is not a state of DAWG($x$). Moreover, we induce a selection among the congruence classes that we call strict classes of factors of $\equiv_{S(x)}$ and that are defined as follows:

Definition 3. Let $u$ be a word of $C$, a class of factors of $\equiv_{S(x)}$. If at least two letters $a$ and $b$ of $\Sigma$ exist such that $ua$ and $ub$ are factors of $x$, then we say that $C$ is a strict class of factors of $\equiv_{S(x)}$.

We also introduce the function $\mathrm{endpos}_x : F(x) \to \mathbb{N}$, defined, for every word $u$, by:

$\mathrm{endpos}_x(u) = \min\{|w| \mid w \text{ prefix of } x \text{ and } u \text{ suffix of } w\}$

and the function $\mathrm{length}_x$ defined on states of DAWG($x$) by:

$\mathrm{length}_x(p) = |u|$, with $u$ representative of $p$.

The word $u$ also corresponds to the concatenated labels of transitions of the longest path from the initial state to $p$ in DAWG($x$). The transitions that belong to the spanning tree of longest paths from the initial state are called solid transitions. Equivalently, for each transition $(p, a, q)$ we have the property:

$(p, a, q)$ is solid $\iff \mathrm{length}_x(q) = \mathrm{length}_x(p) + 1$.

The function $\mathrm{length}_x$ works as well for multi-letter transitions, just replacing 1 in the above equivalence by the length of the label of the transition. This extends the notion of solid transitions to multi-letter transitions:

$(p, \alpha, q)$ is solid $\iff \mathrm{length}_x(q) = \mathrm{length}_x(p) + |\alpha|$.

In addition, we define the suffix link for a state of DAWG($x$) by:
Definition 4. Let $p$ be a state of DAWG($x$), different from the initial state, and let $u$ be a word of the equivalence class $p$. The suffix link of $p$, denoted by $s_x(p)$, is the state $q$ whose representative $v$ is the longest suffix $z$ of $u$ such that $u \not\equiv_{S(x)} z$.

Note that, consequently to this definition, we have $\mathrm{length}_x(q) < \mathrm{length}_x(p)$. Then, by iteration, suffix links induce suffix paths in DAWG($x$), which is an important notion used by the construction algorithm. Indeed, as a consequence of the above inequality, the sequence $(p, s_x(p), s_x^2(p), \ldots)$ is finite and ends at the initial state of DAWG($x$). This sequence is called the suffix path of $p$.

3 Compact Directed Acyclic Word Graphs

3.1 Definition

The compression of DAWGs is based on the deletion of some states and their corresponding transitions. This is possible using multi-letter transitions and the selection of strict classes of factors defined in the previous section (Definition 3). Thus, we define the Compact DAWG as follows.

Definition 5. The Compact Directed Acyclic Word Graph of a word $x$, denoted by CDAWG($x$), is the compaction of DAWG($x$) obtained by keeping only states that are either terminal states or strict classes of factors according to $\equiv_{S(x)}$, and by labeling transitions accordingly.

Consequently to Definition 3, the strict classes of factors correspond to the states that have an outdegree greater than one. So, we can delete every state having outdegree one exactly, except terminal states. Note that initial and final states are terminal states too, so they are not deleted.

[Figure 2. CDAWG(gtagtaaac)]

The construction of the DAWG of a word including some repetitions shows that many states have outdegree one only. For example, in Figure 1, the DAWG of the word gtagtaaac has 12 states, 7 of which have outdegree one; it has 18 transitions. Figure 2 displays the result after the deletion of these states, using multi-letter transitions. The resulting automaton has only 5 states and 11 edges.
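The state and edge counts quoted for the example can be checked by brute force straight from the definitions. The sketch below is not the paper's construction: it assumes the example word is gtagtaaac, encodes the right context of a factor by its set of end positions (two factors have the same right context exactly when they end at the same positions), and uses a hypothetical helper name, `dawg_and_cdawg_sizes`.

```python
from collections import defaultdict


def dawg_and_cdawg_sizes(x):
    """Return (DAWG states, DAWG edges, CDAWG states, CDAWG edges) for x.

    Brute force, cubic in len(x): meant only to illustrate Definitions 1-5.
    """
    n = len(x)
    end_positions = defaultdict(set)
    end_positions[""] = set(range(n + 1))      # the empty word ends everywhere
    for i in range(n):
        for j in range(i + 1, n + 1):
            end_positions[x[i:j]].add(j)       # factor x[i:j] ends at position j

    # One class of factors (one DAWG state) per distinct set of end positions.
    classes = {frozenset(ends) for ends in end_positions.values()}

    dawg_states = dawg_edges = cdawg_states = cdawg_edges = 0
    for ends in classes:
        next_letters = {x[j] for j in ends if j < n}   # outgoing transitions
        is_terminal = n in ends                         # the class contains a suffix
        is_strict = len(next_letters) >= 2              # Definition 3
        dawg_states += 1
        dawg_edges += len(next_letters)
        if is_terminal or is_strict:                    # Definition 5
            cdawg_states += 1
            cdawg_edges += len(next_letters)
    return dawg_states, dawg_edges, cdawg_states, cdawg_edges


print(dawg_and_cdawg_sizes("gtagtaaac"))  # (12, 18, 5, 11)
```

Each outgoing DAWG edge of a kept state becomes exactly one multi-letter edge after compaction, which is why the CDAWG edge count is simply the sum of the outdegrees of the kept states.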
According to experiments to construct DAWGs of biological DNA sequences, considering them as words over the alphabet $\Sigma = \{a, c, g, t\}$, we found that more than 60% of states have outdegree one. So, the deletion of these states is worthwhile: it provides an important saving. The average analysis of the number of states and edges is done in [5] in a Bernoulli model of probability.

When a state $p$ is deleted, the deletion of outgoing edges is realized by adding the label of the outgoing edge of the deleted state to the labels of its incoming edges. For example, let $r$, $p$ and $q$ be states linked by transitions $(r, b, p)$ and $(p, a, q)$. We replace the edges $(r, b, p)$ and $(p, a, q)$ by the edge $(r, ba, q)$. By recursion, we extend this method to every multi-letter transition $(r, \alpha, p)$.

In the example (Figure 1), one can note that, inside the word gtagtaaac, occurrences of g are followed by ta, and those of t and gt by a. So, gta is the representative of state 3 and it is not necessary to create states for g and (gt or t). Then, we directly connect state I to state 3 with edges (I, gta, 3) and (I, ta, 3). States 1 and 2 are so deleted.

The suffix links defined on states of DAWGs remain valid when we reduce them to CDAWGs because of the next lemma.

Lemma 6. If $p$ is a state of CDAWG($x$), then $s_x(p)$ is a state of CDAWG($x$).

3.2 Size bounds

By Theorem 2 DAWG($x$) is linear in $|x|$. As we shall see below (Section 3.3), labels of multi-letter transitions are implemented in constant space. So, the size of CDAWG($x$) is also $O(|x|)$. Meanwhile, as we delete many states and edges, we review the exact bounds on the number of states and edges of CDAWG($x$). They are respectively denoted by States($x$) and Edges($x$).

Corollary 7. Given $x \in \Sigma^*$, if $|x| = 0$, then States($x$) $= 1$; if $|x| = 1$, then States($x$) $= 2$; else $|x| \ge 2$, then $2 \le$ States($x$) $\le |x| + 1$ and the upper bound is reached when $x$ is in the form $a^{|x|}$, where $a \in \Sigma$.

Corollary 8. Given $x \in \Sigma^*$, if $|x| = 0$, Edges($x$) $= 0$; if $|x| = 1$, Edges($x$) $= 1$; else $|x| \ge 2$, then Edges($x$) $\le 2|x| - 2$ and this upper bound is reached when $x$ is in the form $a^{|x|-1}c$, where $a$ and $c$ are two different letters of $\Sigma$.
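As a sanity check of the edge bound, here is a short worked count for a word of the extremal shape, taken here to be $x = a^{n-1}c$ with $n = |x| \ge 2$ and $a \ne c$ (the choice of letters is an assumption of this sketch). Every $a^k$ with $0 \le k \le n-2$ is followed in $x$ by both $a$ and $c$, so its class is strict; $a^{n-1}$ is followed by $c$ only and is not a suffix, so its state is deleted; all factors ending with $c$ share the class of the final state $F$.

```latex
\[
\underbrace{\{\, I = a^{0},\ a,\ a^{2},\ \dots,\ a^{n-2} \,\}}_{n-1 \text{ states, each of outdegree } 2}
\ \cup\ \{F\}
\quad\Longrightarrow\quad
\mathrm{States}(a^{n-1}c) = n \le |x| + 1,
\qquad
\mathrm{Edges}(a^{n-1}c) = 2\,(n-1) = 2|x| - 2 .
\]
```

For $n = 4$ this gives 4 states and 6 edges for the word aaac.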
3.3 Implementation and Results

Transition matrices and adjacency lists are the classical implementations of automata. Their principal difference lies in the implementation of transitions. The first one gives direct access to transitions, but requires $O(\mathrm{States}(x) \times \mathrm{card}(\Sigma))$ space. The second one stores only the exact number of transitions in memory, but needs $O(\log \mathrm{card}(\Sigma))$ time to access them. When the size of the alphabet is big and the transition matrix is sparse, adjacency lists are preferable. Otherwise, like for genomic sequences, the transition matrix is a better choice, as shown by the experiments below. So, we only consider here transition matrices to implement CDAWGs.

We now describe the exact implementation of states and edges. We do this on a four-letter alphabet, so characters take 0.25 byte. We use integers encoded with 4 bytes. For each state, to encode the target state of outgoing edges, transition matrices need a vector of 4 integers. Adjacency lists need, for each edge, 2 integers, one for the target state and another one for the pointer to the next edge.

The basic information required to construct the DAWG is composed of a table to implement the function $s_x$ and one boolean value (0.125 byte) for each edge to know if it is solid or not. For the CDAWG, in order to implement multi-letter transitions, we need one integer for the $\mathrm{endpos}_x$ value of each state, and another integer for the label length of each edge. And that is all. Indeed, we can find the label of a transition by cutting off the length of this transition from the $\mathrm{endpos}_x$ value of its ending state. This gives the position of the label in the source and its length. Keeping the source in memory is negligible considering the global size of the automaton (0.25 byte per character). This is a quite convenient solution also used for suffix trees. Figure 3 displays how the states of CDAWG(gtagtaaac) are implemented.

[Figure 3. Data Structure of CDAWG(gtagtaaac)]

Then, respectively for transition matrices and adjacency lists, each state requires 20.5 and 17.13 bytes for the DAWG, and 40.5 and 41.21 bytes for the CDAWG. As a reference, suffix trees, as implemented by McCreight [13], need 28.25 and 20.25 bytes per state. Moreover, for CDAWG and suffix trees the source has to be stored in main memory. Theoretical average numbers of states, calculated by Blumer et al. ([5]), are $0.54n$ for CDAWG, $1.62n$ for DAWG, and $1.62n$ for suffix trees, when $n$ is the length of $x$. This gives respective sizes in bytes per character of the source: 45.68 and 32.70 for suffix trees, 33.26 and 27.80 for DAWGs, and 22.40 and 22.78 for CDAWGs.

Considering the complete data structures required for applications, the function $\mathrm{endpos}_x$ has to be added for the DAWG and the suffix tree. In addition, the occurrence number of each factor has to be stored in each state for all the structures. Therefore, the respective sizes in bytes per character of the source become: 58.66 and 45.68 for suffix trees, 46.24 and 40.78 for DAWGs, and 24.26 and 24.72 for CDAWGs.

Source                |x|   Nb states / |x|   Nb transitions / |x|   Nb transitions / Nb states   memory
                            dawg    cdawg     dawg    cdawg          dawg    cdawg                gain
chro                  …     …,64    0,54      2,54    1,44           1,55    2,66                 50,36%
coli                  …     …,64    0,54      2,54    1,44           1,53    2,66                 51,95%
bs                    …     …,66    0,50      2,50    1,34           1,50    2,66                 54,78%
bs                    …     …,64    0,54      2,54    1,44           1,55    2,66                 50,16%
random                …     …,62    0,55      2,54    1,47           1,57    2,68                 49,53%
random                …     …,62    0,55      2,55    1,47           1,57    2,68                 49,35%
random                …     …,62    0,54      2,54    1,46           1,56    2,68                 49,68%
random                …     …,62    0,54      2,54    1,46           1,56    2,68                 49,47%
theor. aver. ratios         1,63    0,54      2,54    1,46           1,56    2,67                 50,55%

Table 1. Statistic table comparing DAWG and CDAWG.

Moreover, Table 1 compares sizes of DAWG and CDAWG meant for applications to DNA sequences. Sizes for random words of different lengths and $|\Sigma| = 4$ are also given. DNA sequences are a Saccharomyces cerevisiae yeast chromosome (chro), a contig of an Escherichia coli DNA sequence (coli), and contigs 1 and 115 of a Bacillus subtilis DNA sequence (bs). Numbers of states and edges relative to the length of the source and the memory space gain are displayed. Theoretical average ratios are given, calculated from Blumer et al. ([5]).

First, we observe that there are 2/3 fewer states in the CDAWG, and nearly half as many edges. Second, the memory space saving is about 50%. Third, the number of edges per state goes up to 2.66. With a four-letter alphabet, this is interesting because the transition matrix becomes smaller than adjacency lists. At the same time, we keep direct access to transitions.
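The label recovery described in this section (cut the edge length off the $\mathrm{endpos}_x$ value of the target state) amounts to one slice of the source. The sketch below is only an illustration: the function name `edge_label` is hypothetical, the source is assumed to be the example word gtagtaaac, and $\mathrm{endpos}_x$ is taken as the length of the shortest prefix ending with the factor, as in Section 2.

```python
def edge_label(source, endpos_of_target, label_length):
    """Recover a multi-letter edge label from two integers and the source text.

    endpos_of_target: endpos value stored in the state the edge points to.
    label_length:     integer stored with the edge.
    """
    start = endpos_of_target - label_length
    return source[start:endpos_of_target]


x = "gtagtaaac"
# State 3 (representative gta) first ends at position 3; F ends at position 9.
print(edge_label(x, 3, 3))  # gta     edge (I, gta, 3)
print(edge_label(x, 3, 2))  # ta      edge (I, ta, 3)
print(edge_label(x, 9, 6))  # gtaaac  edge (3, gtaaac, F)
```

Storing two integers per edge instead of the label itself is what keeps multi-letter transitions in constant space.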
4 Constructing CDAWG from DAWG

The DAWG construction is fully described and proved in [3] and [7]. As we show in this section, the CDAWG is easily derived from the DAWG.
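One way to picture this derivation is the sketch below. It is an assumption-laden illustration, not the paper's code: `build_dawg` is the textbook online suffix-automaton construction (in the spirit of [3] and [7]), and `compact` is an iterative stand-in for the recursive reduction presented in this section. All names are hypothetical, and edge labels are stored as plain strings for readability rather than as (position, length) pairs.

```python
def build_dawg(x):
    """Online suffix automaton: returns (transitions per state, terminal states)."""
    nxt, link, length = [{}], [-1], [0]          # state 0 is the initial state
    last = 0
    for ch in x:
        cur = len(nxt)
        nxt.append({}); link.append(-1); length.append(length[last] + 1)
        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[q] == length[p] + 1:       # (p, ch, q) is solid
                link[cur] = q
            else:                                # non-solid edge: duplicate q
                clone = len(nxt)
                nxt.append(dict(nxt[q])); link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone           # redirect along the suffix path
                    p = link[p]
                link[q] = link[cur] = clone
        last = cur
    terminal, p = set(), last
    while p != -1:                               # suffix path of the last state
        terminal.add(p)
        p = link[p]
    return nxt, terminal


def compact(nxt, terminal):
    """Delete non-terminal states of outdegree one, merging labels along the way."""
    keep = {s for s in range(len(nxt)) if s in terminal or len(nxt[s]) > 1}
    edges = {}
    for s in keep:
        for ch, t in nxt[s].items():
            label = ch
            while t not in keep:                 # t has exactly one outgoing edge
                (c2, t2), = nxt[t].items()
                label += c2
                t = t2
            edges[(s, ch)] = (label, t)          # keyed by source and first letter
    return keep, edges


dawg, terminal = build_dawg("gtagtaaac")
states, edges = compact(dawg, terminal)
print(len(dawg), sum(len(d) for d in dawg))      # 12 18
print(len(states), len(edges))                   # 5 11
print(edges[(0, "g")][0], edges[(0, "t")][0])    # gta ta
```

The sketch makes the drawback visible: the full 12-state automaton has to exist in memory before the 5-state result can be produced.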
We just need to apply the definition of the CDAWG recursively. This is computed by the function Reduction, given below. Observe that, in this function, state$(p, a]$ denotes the state pointed to by the transition $(p, a]$. The computation is done with a depth-first traversal of the automaton, and runs in time linear in the number of transitions of DAWG($x$). Then, by Theorem 2, the computation also runs in time linear in the length of the text. However, this method needs to construct the DAWG first, which spends time and memory space proportional to DAWG($x$), though CDAWG($x$) is significantly smaller. So, it is better to construct the CDAWG directly.

Reduction (state E) returns (ending state, length of redirected edge)
    If (E not marked) Then
        For all existing edge (E, a] Do
            (state(E, a], |label((E, a])|) ← Reduction(state(E, a]);
        mark(E) ← TRUE;
    If (E is of outdegree one) Then
        Let (E, a] this edge;
        Return (state(E, a], 1 + |label((E, a])|);
    Else
        Return (E, 1);

5 Direct Construction of CDAWG

In this section, we give the direct construction of CDAWGs and show that the running time is linear in the size of the input word $x$ on a fixed alphabet.

5.1 Algorithm

Since the CDAWG of $x$ is a minimization of its suffix tree, it is rather natural to base the direct construction on McCreight's algorithm [13]. Meanwhile, properties of the DAWG construction are also used, especially suffix links (a notion that is different from the suffix links of McCreight's algorithm), lengths, and positions, as explained in the previous section.

First, we introduce the notions used by the algorithm, some of which are taken from [13]. The algorithm constructs the CDAWG of the word $x$ of length $n$, noted $x_{0..n-1}$. The automaton is defined by a set of states and transitions, especially with I and F, the initial and final states. A partial path represents a connected sequence of edges between two states of the automaton. A path is a partial path that begins at I. The label of a path is the concatenation of the labels of corresponding edges. The locus, or exact locus, of a string is the end of the path labeled by the string.
The contracted locus of a string $\alpha$ is the locus of the longest prefix of $\alpha$ whose locus is defined.
Preliminary Algorithm

Basically, the algorithm to build a CDAWG inserts the paths corresponding to all the suffixes of $x$ from the longest to the shortest. We define $suf_i$ as the suffix $x_{i..n-1}$ of $x$. We denote by $A_i$ the automaton constructed after the insertion of all the $suf_j$ for $0 \le j \le i$.

[Figure 4. Construction of CDAWG(aabbabbc)]

Figure 4 displays four steps of the construction of CDAWG(aabbabbc). In this figure (and the following ones), the dashed edges represent suffix links of states, which are used subsequently.

We initialize the automaton $A_\varepsilon$ with states I and F. At step $i$ ($i > 0$), the algorithm inserts a path corresponding to $suf_i$ in $A_{i-1}$ and produces $A_i$. The algorithm satisfies the following invariant properties:

P1: at the beginning of step $i$, all suffixes $suf_j$, $0 \le j < i$, are paths in $A_{i-1}$.
P2: at the beginning of step $i$, the states of $A_{i-1}$ are in one-to-one correspondence with the longest common prefixes of pairs of suffixes longer than $suf_j$.

We define $head_i$ as the longest prefix of $suf_i$ which is also a prefix of $suf_j$ for some $j < i$. Equivalently, $head_i$ is the longest prefix of $suf_i$ which is also a path of $A_{i-1}$. We define $tail_i$ as $head_i^{-1} suf_i$. At step $i$, the preliminary algorithm has to insert $tail_i$ from the locus of $head_i$ in $A_{i-1}$ (see Figure 5).

[Figure 5. Scheme of the insertion of $suf_i$ in $A_{i-1}$.]

To do so, the contracted locus of $head_i$ in $A_{i-1}$ is found with the help of function SlowFind that compares letter-to-letter the right path of $A_{i-1}$ to $suf_i$. This is similar to the corresponding McCreight's procedure, except on what is explained below. Then, if necessary, a new state is created to split the last encountered edge, a state that is the locus of $head_i$. The automaton B of Figure 4 displays the creation of state 1 during the insertion of $suf_1$ = abbabbc. Note that, if an already existing state matches the strict class of factor of $head_i$, the last encountered edge is split in the same way, but it is redirected to this state. Such an example appears in the same example (case D): the insertion of $suf_5$ = bbc induces the redirection of the edge (2, babbc, F) that becomes (2, b, 3). Then, an edge labeled by $tail_i$ is created from the locus of $head_i$ to F. We can write the preliminary algorithm as follows:

Preliminary Algorithm
    For all suf_i (i ∈ [0..n-1]) Do
        (q, γ) ← SlowFind(I);
        If (γ = ε) Then
            insert (q, tail_i, F);
        Else
            create v locus of head_i splitting (q, γ] and insert (v, tail_i, F);
            or redirect (q, γ] onto v, the last created state;
    End For all;
    mark terminal states;

Note first that SlowFind returns the last encountered state. This keeps accessible the transition $(q, \gamma]$ that can be split if this state is not an exact locus. Second, as in the DAWG construction, if a non-solid edge is encountered during SlowFind, its target state has to be duplicated in a clone and the non-solid edge is redirected to this clone. But, if the clone has just been created at the previous step, the edge is redirected to this state. Note that, in the two cases, the redirected transition becomes solid. Finally, when $tail_i = \varepsilon$ at the end of the construction, terminal states are marked along the suffix path of F.

From the above discussion, a proof of the invariance of properties P1 and P2 can be derived. Thus, at the end of the algorithm all subwords of $x$ and only these words are labels of paths in the automaton (property P1). By property P2, states correspond to strict classes of factors (when the longest common prefix of a pair of suffixes is not equal to any of them) or to terminal states (when the contrary holds). This gives a sketch of the correctness of the algorithm.
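The quantities $head_i$ and $tail_i$ that drive each step can be computed by brute force from their definition. The sketch below does only that; it builds no automaton, the helper name `heads_and_tails` is hypothetical, and the word aabbabbc is assumed to be the one used in Figure 4.

```python
def heads_and_tails(x):
    """For each i, return (head_i, tail_i) with suf_i = head_i + tail_i.

    head_i is the longest prefix of suf_i that is also a prefix of some
    earlier (hence longer) suffix suf_j, j < i.
    """
    result = []
    for i in range(len(x)):
        suf_i = x[i:]
        best = 0
        for j in range(i):
            suf_j = x[j:]                     # strictly longer than suf_i
            k = 0
            while k < len(suf_i) and suf_i[k] == suf_j[k]:
                k += 1
            best = max(best, k)
        result.append((suf_i[:best], suf_i[best:]))
    return result


for i, (head, tail) in enumerate(heads_and_tails("aabbabbc")):
    print(i, repr(head), repr(tail))
# 1 'a' 'bbabbc'   -> state 1 is created as the locus of a
# 4 'abb' 'c'
# 5 'bb' 'c'       -> the step that redirects (2, babbc, F) to (2, b, 3)
```

Comparing every suffix letter by letter, as SlowFind does in the preliminary algorithm, costs time proportional to the sum of the suffix lengths, which is where a quadratic bound comes from.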
The running time of the preliminary algorithm is $O(|x|^2)$ (with an implementation by transition matrix), as is the sum of lengths of all suffixes of the word $x$.

Linear Algorithm

To get a linear-time algorithm, we use together properties of DAWGs construction and of suffix trees construction. The main feature is the notion of suffix links. They are defined as for DAWGs in Section 2. They are the clue for the linear running time of the algorithm.

Three elements have to be pointed out about suffix links in the CDAWG. First, we do not need to initialize suffix links. Indeed, when $suf_0$ is inserted, $x_0$ is obviously a new letter, which directly induces $s_x(F) = I$. Note that $s_x(I)$ is never used, and so never defined. Second, traveling along the suffix path of a state $p$ does not necessarily end at state I. Indeed, with multi-letter transitions, if $s_x(p) = I$ we have to treat the suffix $a^{-1}\alpha$ ($a \in \Sigma$) where $\alpha$ is the representative of $p$. And third, suffix links induce the following invariant property satisfied at step $i$:

P3: at the beginning of step $i$, the suffix links are defined for each state of $A_{i-1}$ according to Definition 4.

The next remark allows redirections without having to search with SlowFind for existing states belonging to a same class of factors.

Remark. Let $\alpha$ have locus $p$ and assume that $q = s_x(p)$ is the locus of $\beta$. Then, $p$ is the locus of suffixes of $\alpha$ whose lengths are greater than $|\beta|$.

The algorithm has to deal with suffix links each time a state is created. This happens when a state is duplicated, and when a state is created after the execution of SlowFind.

In the duplication, suffix links are updated as follows. Let $w$ be the clone of $q$. In regard to strict classes of factors and Definition 4, the class of $w$ is inserted between the ones of $q$ and $s_x(q)$. So, we update suffix links by setting $s_x(w) = s_x(q)$ and $s_x(q) = w$. Moreover, the duplication has the same properties as in the DAWG construction. Let $(p, \gamma, q)$ be the transition redirected during the duplication of $q$. We can redirect all non-solid edges that end the partial path $\gamma$ and that start from a state of the suffix path of $p$. This is done until the first edge that is solid. We are helped in this operation by the function FastFind, similar to the one used in McCreight's algorithm [13], that goes through transitions just comparing the first letters of their labels. This function returns the last encountered state and edge. Note that it is not necessary to find each time the partial path $\gamma$ from a suffix of $p$: we just need to take the suffix link of the last encountered state and the label of the previous redirected transition. Let $\vartheta$ be the representative of a state of the suffix path of $p$. Observe that the corresponding redirection is equivalent to inserting $suf_{i+|\alpha|-|\vartheta|}$. Indeed, all operations done after this redirection will be the same as for the insertion of $suf_i$, since they go through the same path.
[Figure 6. Scheme of the search using suffix links]

After the execution of SlowFind, if a state $v$ is created, we have to compute its suffix link. Let $\gamma$ be the label of the transition starting at $q$ and ending at $v$. To compute the suffix link, the algorithm goes through the path having label $\gamma$ from the suffix link of $q$, $s = s_x(q)$. The operation is repeated if necessary. Figure 6 displays a scheme of this search. The thick dashed edges represent paths in the automaton, and the thin dashed edge represents the suffix link of $q$. This search will allow to insert, as for the duplication, the suffixes $suf_j$, for $i < j < i + |head_i|$. To travel along the path, we use again the function FastFind. Let $r$ and $(r, \gamma]$ be the last state and transition encountered by FastFind. If $r$ is the exact locus of $\gamma$, it is the wanted state, and we set then $s_x(v) = r$. Else, if $(r, \gamma]$ is a solid edge, then we have to create a new node $w$. The edge $(r, \gamma]$ is split, it becomes $(r, \gamma, w)$, and we insert the transition $(w, tail_i, F)$. Else, $(r, \gamma]$ is non-solid. Then, it is split and becomes $(r, \gamma, v)$. In the two last cases, since $s_x(v)$ is not found, we run FastFind again with $s_x(r)$ and $\gamma$, and this goes on until $s_x(v)$ is eventually found, that is, when $\gamma = \varepsilon$.

The discussion shows how suffix links are updated to ensure that property P3 is satisfied. The operations do not influence the correctness of the algorithm, sketched in the last section, but yield the following linear-time algorithm. Its time complexity is discussed in the next section.

Linear Algorithm
    p ← I; i ← 0;
    While not end of x Do
        (q, γ) ← SlowFind(p);
        If (γ = ε) Then
            insert (q, tail_i, F);
            s_x(F) ← q;
            If (q ≠ I) Then p ← s_x(q) Else p ← I;
        Else
            create v locus of head_i splitting (q, γ];
            insert (v, tail_i, F);
            s_x(F) ← v;
            find r = s_x(v) with FastFind;
            p ← r;
        update i;
    End While;
    mark terminal states;
5.2 Complexity

Theorem 9. The algorithm that builds the CDAWG of a word $x$ of $\Sigma^*$ can be implemented in time $O(|x|)$ and in space $O(|x| \times \mathrm{card}(\Sigma))$ with a transition matrix, or in time $O(|x| \times \log \mathrm{card}(\Sigma))$ and in space $O(|x|)$ with adjacency lists.

[Figure 7. Positions of labels when $suf_i$ is inserted]

Sketch of the proof. It can be proved that each step of the algorithm strictly increases variable $j$ or $k$ in the generic situation displayed in Figure 7. These variables respectively represent the index of the current suffix being inserted, and a pointer on the text. These variables never decrease. Therefore, the total running time of the algorithm is linear in the length of $x$.

6 Conclusion

We have considered the Compact Directed Acyclic Word Graph, which is an efficient compact data structure to represent all suffixes of a word. There are many data structures representing this set. But this one allows an interesting space gain compared to the well-known DAWG, which is a reference. Indeed, on the one hand, the upper bounds are of $|x| + 1$ states and $2|x| - 2$ transitions. This saves $|x|$ states and $|x|$ transitions of the DAWG, which leads to a faster utilisation. On the other hand, experiments on genomic DNA sequences and random strings display a memory space gain of 50% with respect to the DAWG. Moreover, when the size of the alphabet is small, transition matrices do not take more space than adjacency lists, keeping a direct access to transitions. Thus, we can construct the data structure of strings twice as large, keeping them in main memory, which is actually important to get efficient treatments.

This work shows that the CDAWG can be constructed directly. The algorithm is linear in the length of the text. Of course, it is easier to compute, by reduction, the CDAWG from the DAWG. On the contrary, our algorithm saves time and space simultaneously.

References

1. A. Anderson and S. Nilsson. Efficient implementation of suffix trees. Software, Practice and Experience, 25(2):129-141, Feb.
2. A. Apostolico. The myriad virtues of subword trees. In A. Apostolico & Z. Galil, editor, Combinatorial Algorithms on Words, pages 85-95. Springer-Verlag.
3. A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M.T. Chen, and J. Seiferas. The smallest automaton recognizing the subwords of a text. Theoret. Comput. Sci., 40:31-55.
4. A. Blumer, J. Blumer, D. Haussler, and R. McConnell. Complete inverted files for efficient text retrieval and analysis. Journal of the Association for Computing Machinery, 34(3):578-595, July.
5. A. Blumer, D. Haussler, and A. Ehrenfeucht. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24:37-45.
6. B. Clift, D. Haussler, R. McDonnell, T.D. Schneider, and G.D. Stormo. Sequence landscapes. Nucleic Acids Research, 4(1):141-158.
7. M. Crochemore. Transducers and repetitions. Theor. Comp. Sci., 45:63-86.
8. M. Crochemore and W. Rytter. Text Algorithms, chapter 5-6, pages 73-130. Oxford University Press, New York.
9. R. W. Irving. Suffix binary search trees. Technical report TR, Computing Science Department, University of Glasgow, April.
10. J. Kärkkäinen. Suffix cactus: a cross between suffix tree and suffix array. CPM, 937:191-204, July.
11. C. Lefevre and J-E. Ikeda. The position end-set tree: A small automaton for word recognition in biological sequences. CABIOS, 9(3):343-348.
12. U. Manber and G. Myers. Suffix arrays: A new method for on-line string searches. SIAM J. Comput., 22(5):935-948, Oct.
13. E. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262-272, Apr.
14. E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249-260.
22: Union Fin CS 473u - Algorithms - Spring 2005 April 14, 2005 1 Union-Fin We wnt to mintin olletion of sets, uner the opertions of: 1. MkeSet(x) - rete set tht ontins the single element x. 2. Fin(x)
More informationDiscrete Structures Lecture 11
Introdution Good morning. In this setion we study funtions. A funtion is mpping from one set to nother set or, perhps, from one set to itself. We study the properties of funtions. A mpping my not e funtion.
More informationLinear choosability of graphs
Liner hoosility of grphs Louis Esperet, Mikel Montssier, André Rspud To ite this version: Louis Esperet, Mikel Montssier, André Rspud. Liner hoosility of grphs. Stefn Felsner. 2005 Europen Conferene on
More informationA Lower Bound for the Length of a Partial Transversal in a Latin Square, Revised Version
A Lower Bound for the Length of Prtil Trnsversl in Ltin Squre, Revised Version Pooy Htmi nd Peter W. Shor Deprtment of Mthemtil Sienes, Shrif University of Tehnology, P.O.Bo 11365-9415, Tehrn, Irn Deprtment
More information, g. Exercise 1. Generator polynomials of a convolutional code, given in binary form, are g. Solution 1.
Exerise Genertor polynomils of onvolutionl ode, given in binry form, re g, g j g. ) Sketh the enoding iruit. b) Sketh the stte digrm. ) Find the trnsfer funtion T. d) Wht is the minimum free distne of
More information2.4 Theoretical Foundations
2 Progrmming Lnguge Syntx 2.4 Theoretil Fountions As note in the min text, snners n prsers re se on the finite utomt n pushown utomt tht form the ottom two levels of the Chomsky lnguge hierrhy. At eh level
More informationPart 4. Integration (with Proofs)
Prt 4. Integrtion (with Proofs) 4.1 Definition Definition A prtition P of [, b] is finite set of points {x 0, x 1,..., x n } with = x 0 < x 1
More informationHybrid Systems Modeling, Analysis and Control
Hyrid Systems Modeling, Anlysis nd Control Rdu Grosu Vienn University of Tehnology Leture 5 Finite Automt s Liner Systems Oservility, Rehility nd More Miniml DFA re Not Miniml NFA (Arnold, Diky nd Nivt
More informationLecture Notes No. 10
2.6 System Identifition, Estimtion, nd Lerning Leture otes o. Mrh 3, 26 6 Model Struture of Liner ime Invrint Systems 6. Model Struture In representing dynmil system, the first step is to find n pproprite
More informationIntermediate Math Circles Wednesday 17 October 2012 Geometry II: Side Lengths
Intermedite Mth Cirles Wednesdy 17 Otoer 01 Geometry II: Side Lengths Lst week we disussed vrious ngle properties. As we progressed through the evening, we proved mny results. This week, we will look t
More informationThe University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER MACHINES AND THEIR LANGUAGES ANSWERS
The University of ottinghm SCHOOL OF COMPUTR SCIC A LVL 2 MODUL, SPRIG SMSTR 2015 2016 MACHIS AD THIR LAGUAGS ASWRS Time llowed TWO hours Cndidtes my omplete the front over of their nswer ook nd sign their
More information= state, a = reading and q j
4 Finite Automt CHAPTER 2 Finite Automt (FA) (i) Derterministi Finite Automt (DFA) A DFA, M Q, q,, F, Where, Q = set of sttes (finite) q Q = the strt/initil stte = input lphet (finite) (use only those
More information(a) A partition P of [a, b] is a finite subset of [a, b] containing a and b. If Q is another partition and P Q, then Q is a refinement of P.
Chpter 7: The Riemnn Integrl When the derivtive is introdued, it is not hrd to see tht the it of the differene quotient should be equl to the slope of the tngent line, or when the horizontl xis is time
More informationNON-DETERMINISTIC FSA
Tw o types of non-determinism: NON-DETERMINISTIC FS () Multiple strt-sttes; strt-sttes S Q. The lnguge L(M) ={x:x tkes M from some strt-stte to some finl-stte nd ll of x is proessed}. The string x = is
More informationp-adic Egyptian Fractions
p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction
More informationINTEGRATION. 1 Integrals of Complex Valued functions of a REAL variable
INTEGRATION NOTE: These notes re supposed to supplement Chpter 4 of the online textbook. 1 Integrls of Complex Vlued funtions of REAL vrible If I is n intervl in R (for exmple I = [, b] or I = (, b)) nd
More informationMore Properties of the Riemann Integral
More Properties of the Riemnn Integrl Jmes K. Peterson Deprtment of Biologil Sienes nd Deprtment of Mthemtil Sienes Clemson University Februry 15, 2018 Outline More Riemnn Integrl Properties The Fundmentl
More informationI1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3
2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is
More informationDiscrete Structures, Test 2 Monday, March 28, 2016 SOLUTIONS, VERSION α
Disrete Strutures, Test 2 Mondy, Mrh 28, 2016 SOLUTIONS, VERSION α α 1. (18 pts) Short nswer. Put your nswer in the ox. No prtil redit. () Consider the reltion R on {,,, d with mtrix digrph of R.. Drw
More informationComparing the Pre-image and Image of a Dilation
hpter Summry Key Terms Postultes nd Theorems similr tringles (.1) inluded ngle (.2) inluded side (.2) geometri men (.) indiret mesurement (.6) ngle-ngle Similrity Theorem (.2) Side-Side-Side Similrity
More informationIntroduction to Olympiad Inequalities
Introdution to Olympid Inequlities Edutionl Studies Progrm HSSP Msshusetts Institute of Tehnology Snj Simonovikj Spring 207 Contents Wrm up nd Am-Gm inequlity 2. Elementry inequlities......................
More informationCMSC 330: Organization of Programming Languages. DFAs, and NFAs, and Regexps (Oh my!)
CMSC 330: Orgniztion of Progrmming Lnguges DFAs, nd NFAs, nd Regexps (Oh my!) CMSC330 Spring 2018 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All
More information] dx (3) = [15x] 2 0
Leture 6. Double Integrls nd Volume on etngle Welome to Cl IV!!!! These notes re designed to be redble nd desribe the w I will eplin the mteril in lss. Hopefull the re thorough, but it s good ide to hve
More informationCS375: Logic and Theory of Computing
CS375: Logic nd Theory of Computing Fuhu (Frnk) Cheng Deprtment of Computer Science University of Kentucky 1 Tble of Contents: Week 1: Preliminries (set lgebr, reltions, functions) (red Chpters 1-4) Weeks
More informationCommon intervals of genomes. Mathieu Raffinot CNRS LIAFA
Common intervls of genomes Mthieu Rffinot CNRS LIF Context: omprtive genomis. set of genomes prtilly/totlly nnotte Informtive group of genes or omins? Ex: COG tse Mny iffiulties! iology Wht re two similr
More informationLine Integrals and Entire Functions
Line Integrls nd Entire Funtions Defining n Integrl for omplex Vlued Funtions In the following setions, our min gol is to show tht every entire funtion n be represented s n everywhere onvergent power series
More informationWhere did dynamic programming come from?
Where did dynmic progrmming come from? String lgorithms Dvid Kuchk cs302 Spring 2012 Richrd ellmn On the irth of Dynmic Progrmming Sturt Dreyfus http://www.eng.tu.c.il/~mi/cd/ or50/1526-5463-2002-50-01-0048.pdf
More informationarxiv: v1 [math.ca] 21 Aug 2018
rxiv:1808.07159v1 [mth.ca] 1 Aug 018 Clulus on Dul Rel Numbers Keqin Liu Deprtment of Mthemtis The University of British Columbi Vnouver, BC Cnd, V6T 1Z Augest, 018 Abstrt We present the bsi theory of
More informationChapter 3. Vector Spaces. 3.1 Images and Image Arithmetic
Chpter 3 Vetor Spes In Chpter 2, we sw tht the set of imges possessed numer of onvenient properties. It turns out tht ny set tht possesses similr onvenient properties n e nlyzed in similr wy. In liner
More informationChapter 4 State-Space Planning
Leture slides for Automted Plnning: Theory nd Prtie Chpter 4 Stte-Spe Plnning Dn S. Nu CMSC 722, AI Plnning University of Mrylnd, Spring 2008 1 Motivtion Nerly ll plnning proedures re serh proedures Different
More information8 THREE PHASE A.C. CIRCUITS
8 THREE PHSE.. IRUITS The signls in hpter 7 were sinusoidl lternting voltges nd urrents of the so-lled single se type. n emf of suh type n e esily generted y rotting single loop of ondutor (or single winding),
More informationNon Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51
Non Deterministic Automt Linz: Nondeterministic Finite Accepters, pge 51 1 Nondeterministic Finite Accepter (NFA) Alphbet ={} q 1 q2 q 0 q 3 2 Nondeterministic Finite Accepter (NFA) Alphbet ={} Two choices
More informationElectromagnetism Notes, NYU Spring 2018
Eletromgnetism Notes, NYU Spring 208 April 2, 208 Ation formultion of EM. Free field desription Let us first onsider the free EM field, i.e. in the bsene of ny hrges or urrents. To tret this s mehnil system
More informationAlgorithm Design and Analysis
Algorithm Design nd Anlysis LECTURE 5 Supplement Greedy Algorithms Cont d Minimizing lteness Ching (NOT overed in leture) Adm Smith 9/8/10 A. Smith; sed on slides y E. Demine, C. Leiserson, S. Rskhodnikov,
More informationBisimulation, Games & Hennessy Milner logic
Bisimultion, Gmes & Hennessy Milner logi Leture 1 of Modelli Mtemtii dei Proessi Conorrenti Pweł Soboiński Univeristy of Southmpton, UK Bisimultion, Gmes & Hennessy Milner logi p.1/32 Clssil lnguge theory
More informationAlgorithm Design and Analysis
Algorithm Design nd Anlysis LECTURE 8 Mx. lteness ont d Optiml Ching Adm Smith 9/12/2008 A. Smith; sed on slides y E. Demine, C. Leiserson, S. Rskhodnikov, K. Wyne Sheduling to Minimizing Lteness Minimizing
More informationA Study on the Properties of Rational Triangles
Interntionl Journl of Mthemtis Reserh. ISSN 0976-5840 Volume 6, Numer (04), pp. 8-9 Interntionl Reserh Pulition House http://www.irphouse.om Study on the Properties of Rtionl Tringles M. Q. lm, M.R. Hssn
More informationConvert the NFA into DFA
Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:
More informationNecessary and sucient conditions for some two. Abstract. Further we show that the necessary conditions for the existence of an OD(44 s 1 s 2 )
Neessry n suient onitions for some two vrile orthogonl esigns in orer 44 C. Koukouvinos, M. Mitrouli y, n Jennifer Seerry z Deite to Professor Anne Penfol Street Astrt We give new lgorithm whih llows us
More informationMetodologie di progetto HW Technology Mapping. Last update: 19/03/09
Metodologie di progetto HW Tehnology Mpping Lst updte: 19/03/09 Tehnology Mpping 2 Tehnology Mpping Exmple: t 1 = + b; t 2 = d + e; t 3 = b + d; t 4 = t 1 t 2 + fg; t 5 = t 4 h + t 2 t 3 ; F = t 5 ; t
More informationCompression of Palindromes and Regularity.
Compression of Plinromes n Regulrity. Kyoko Shikishim-Tsuji Center for Lierl Arts Eution n Reserh Tenri University 1 Introution In [1], property of likstrem t t view of tse is isusse n it is shown tht
More informationFinite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018
Finite Automt Theory nd Forml Lnguges TMV027/DIT321 LP4 2018 Lecture 10 An Bove April 23rd 2018 Recp: Regulr Lnguges We cn convert between FA nd RE; Hence both FA nd RE ccept/generte regulr lnguges; More
More informationMinimal DFA. minimal DFA for L starting from any other
Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA
More informationRegular languages refresher
Regulr lnguges refresher 1 Regulr lnguges refresher Forml lnguges Alphet = finite set of letters Word = sequene of letter Lnguge = set of words Regulr lnguges defined equivlently y Regulr expressions Finite-stte
More informationState space systems analysis (continued) Stability. A. Definitions A system is said to be Asymptotically Stable (AS) when it satisfies
Stte spce systems nlysis (continued) Stbility A. Definitions A system is sid to be Asymptoticlly Stble (AS) when it stisfies ut () = 0, t > 0 lim xt () 0. t A system is AS if nd only if the impulse response
More informationChapter 2 Finite Automata
Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht
More informationNondeterministic Finite Automata
Nondeterministi Finite utomt The Power of Guessing Tuesdy, Otoer 4, 2 Reding: Sipser.2 (first prt); Stoughton 3.3 3.5 S235 Lnguges nd utomt eprtment of omputer Siene Wellesley ollege Finite utomton (F)
More information1 Online Learning and Regret Minimization
2.997 Decision-Mking in Lrge-Scle Systems My 10 MIT, Spring 2004 Hndout #29 Lecture Note 24 1 Online Lerning nd Regret Minimiztion In this lecture, we consider the problem of sequentil decision mking in
More informationCo-ordinated s-convex Function in the First Sense with Some Hadamard-Type Inequalities
Int. J. Contemp. Mth. Sienes, Vol. 3, 008, no. 3, 557-567 Co-ordinted s-convex Funtion in the First Sense with Some Hdmrd-Type Inequlities Mohmmd Alomri nd Mslin Drus Shool o Mthemtil Sienes Fulty o Siene
More information6.5 Improper integrals
Eerpt from "Clulus" 3 AoPS In. www.rtofprolemsolving.om 6.5. IMPROPER INTEGRALS 6.5 Improper integrls As we ve seen, we use the definite integrl R f to ompute the re of the region under the grph of y =
More informationMath 32B Discussion Session Week 8 Notes February 28 and March 2, f(b) f(a) = f (t)dt (1)
Green s Theorem Mth 3B isussion Session Week 8 Notes Februry 8 nd Mrh, 7 Very shortly fter you lerned how to integrte single-vrible funtions, you lerned the Fundmentl Theorem of lulus the wy most integrtion
More informationLecture 6: Coding theory
Leture 6: Coing theory Biology 429 Crl Bergstrom Ferury 4, 2008 Soures: This leture loosely follows Cover n Thoms Chpter 5 n Yeung Chpter 3. As usul, some of the text n equtions re tken iretly from those
More informationLecture Summaries for Multivariable Integral Calculus M52B
These leture summries my lso be viewed online by liking the L ion t the top right of ny leture sreen. Leture Summries for Multivrible Integrl Clulus M52B Chpter nd setion numbers refer to the 6th edition.
More informationLecture 09: Myhill-Nerode Theorem
CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives
More informationTypes of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt
More informationCompiler Design. Spring Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz
University of Southern Cliforni Computer Siene Deprtment Compiler Design Spring 7 Lexil Anlysis Smple Exerises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sienes Institute 47 Admirlty Wy, Suite
More informationFast index for approximate string matching
Fst index for pproximte string mthing Dekel Tsur Astrt We present n index tht stores text of length n suh tht given pttern of length m, ll the sustrings of the text tht re within Hmming distne (or edit
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic
More informationA Mathematical Model for Unemployment-Taking an Action without Delay
Advnes in Dynmil Systems nd Applitions. ISSN 973-53 Volume Number (7) pp. -8 Reserh Indi Publitions http://www.ripublition.om A Mthemtil Model for Unemployment-Tking n Ation without Dely Gulbnu Pthn Diretorte
More informationMagnetically Coupled Coil
Mgnetilly Coupled Ciruits Overview Mutul Indutne Energy in Coupled Coils Liner Trnsformers Idel Trnsformers Portlnd Stte University ECE 22 Mgnetilly Coupled Ciruits Ver..3 Mgnetilly Coupled Coil i v L
More informationAUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton
25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q
More informationANALYSIS AND MODELLING OF RAINFALL EVENTS
Proeedings of the 14 th Interntionl Conferene on Environmentl Siene nd Tehnology Athens, Greee, 3-5 Septemer 215 ANALYSIS AND MODELLING OF RAINFALL EVENTS IOANNIDIS K., KARAGRIGORIOU A. nd LEKKAS D.F.
More informationEngr354: Digital Logic Circuits
Engr354: Digitl Logi Ciruits Chpter 4: Logi Optimiztion Curtis Nelson Logi Optimiztion In hpter 4 you will lern out: Synthesis of logi funtions; Anlysis of logi iruits; Tehniques for deriving minimum-ost
More informationLIP. Laboratoire de l Informatique du Parallélisme. Ecole Normale Supérieure de Lyon
LIP Lortoire de l Informtique du Prllélisme Eole Normle Supérieure de Lyon Institut IMAG Unité de reherhe ssoiée u CNRS n 1398 One-wy Cellulr Automt on Cyley Grphs Zsuzsnn Rok Mrs 1993 Reserh Report N
More informationCS 491G Combinatorial Optimization Lecture Notes
CS 491G Comintoril Optimiztion Leture Notes Dvi Owen July 30, August 1 1 Mthings Figure 1: two possile mthings in simple grph. Definition 1 Given grph G = V, E, mthing is olletion of eges M suh tht e i,
More informationTypes of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.
CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt
More informationAnatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute
Victor Admchik Dnny Sletor Gret Theoreticl Ides In Computer Science CS 5-25 Spring 2 Lecture 2 Mr 3, 2 Crnegie Mellon University Deterministic Finite Automt Finite Automt A mchine so simple tht you cn
More informationSection 1.3 Triangles
Se 1.3 Tringles 21 Setion 1.3 Tringles LELING TRINGLE The line segments tht form tringle re lled the sides of the tringle. Eh pir of sides forms n ngle, lled n interior ngle, nd eh tringle hs three interior
More informationAP Calculus BC Chapter 8: Integration Techniques, L Hopital s Rule and Improper Integrals
AP Clulus BC Chpter 8: Integrtion Tehniques, L Hopitl s Rule nd Improper Integrls 8. Bsi Integrtion Rules In this setion we will review vrious integrtion strtegies. Strtegies: I. Seprte the integrnd into
More informationData Structures and Algorithm. Xiaoqing Zheng
Dt Strutures nd Algorithm Xioqing Zheng zhengxq@fudn.edu.n String mthing prolem Pttern P ours with shift s in text T (or, equivlently, tht pttern P ours eginning t position s + in text T) if T[s +... s
More informationFormal Languages and Automata
Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University
More informationLearning Partially Observable Markov Models from First Passage Times
Lerning Prtilly Oservle Mrkov s from First Pssge s Jérôme Cllut nd Pierre Dupont Europen Conferene on Mhine Lerning (ECML) 8 Septemer 7 Outline. FPT in models nd sequenes. Prtilly Oservle Mrkov s (POMMs).
More informationChem Homework 11 due Monday, Apr. 28, 2014, 2 PM
Chem 44 - Homework due ondy, pr. 8, 4, P.. . Put this in eq 8.4 terms: E m = m h /m e L for L=d The degenery in the ring system nd the inresed sping per level (4x bigger) mkes the sping between the HOO
More informationHow to simulate Turing machines by invertible one-dimensional cellular automata
How to simulte Turing mchines by invertible one-dimensionl cellulr utomt Jen-Christophe Dubcq Déprtement de Mthémtiques et d Informtique, École Normle Supérieure de Lyon, 46, llée d Itlie, 69364 Lyon Cedex
More informationCS 275 Automata and Formal Language Theory
CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed
More informationMid-Term Examination - Spring 2014 Mathematical Programming with Applications to Economics Total Score: 45; Time: 3 hours
Mi-Term Exmintion - Spring 0 Mthemtil Progrmming with Applitions to Eonomis Totl Sore: 5; Time: hours. Let G = (N, E) e irete grph. Define the inegree of vertex i N s the numer of eges tht re oming into
More informationGeneralization of 2-Corner Frequency Source Models Used in SMSIM
Generliztion o 2-Corner Frequeny Soure Models Used in SMSIM Dvid M. Boore 26 Mrh 213, orreted Figure 1 nd 2 legends on 5 April 213, dditionl smll orretions on 29 My 213 Mny o the soure spetr models ville
More informationLossless Compression Lossy Compression
Administrivi CSE 39 Introdution to Dt Compression Spring 23 Leture : Introdution to Dt Compression Entropy Prefix Codes Instrutor Prof. Alexnder Mohr mohr@s.sunys.edu offie hours: TBA We http://mnl.s.sunys.edu/lss/se39/24-fll/
More informationLecture 1 - Introduction and Basic Facts about PDEs
* 18.15 - Introdution to PDEs, Fll 004 Prof. Gigliol Stffilni Leture 1 - Introdution nd Bsi Fts bout PDEs The Content of the Course Definition of Prtil Differentil Eqution (PDE) Liner PDEs VVVVVVVVVVVVVVVVVVVV
More informationTrigonometry and Constructive Geometry
Trigonometry nd Construtive Geometry Trining prolems for M2 2018 term 1 Ted Szylowie tedszy@gmil.om 1 Leling geometril figures 1. Prtie writing Greek letters. αβγδɛθλµπψ 2. Lel the sides, ngles nd verties
More informationRegular expressions, Finite Automata, transition graphs are all the same!!
CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1
More informationScientific notation is a way of expressing really big numbers or really small numbers.
Scientific Nottion (Stndrd form) Scientific nottion is wy of expressing relly big numbers or relly smll numbers. It is most often used in scientific clcultions where the nlysis must be very precise. Scientific
More information