SPIN: Mining Maximal Frequent Subgraphs from Graph Databases


Research Track Poster

Jun Huan, Wei Wang, Jan Prins
Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
{huan, weiwang, prins}@cs.unc.edu

Jiong Yang
Department of Computer Science, University of Illinois, Urbana-Champaign, IL 61801, USA
jiong@cs.uiuc.edu

ABSTRACT
One fundamental challenge for mining recurring subgraphs from semi-structured data sets is the overwhelming abundance of such patterns. In large graph databases, the total number of frequent subgraphs can become too large to allow a full enumeration using reasonable computational resources. In this paper, we propose a new algorithm that mines only maximal frequent subgraphs, i.e. subgraphs that are not a part of any other frequent subgraph. This may exponentially decrease the size of the output set in the best case; in our experiments on practical data sets, mining maximal frequent subgraphs reduces the total number of mined patterns by two to three orders of magnitude. Our method first mines all frequent trees from a general graph database and then reconstructs all maximal subgraphs from the mined trees. Using two chemical structure benchmarks and a set of synthetic graph data sets, we demonstrate that, in addition to decreasing the output size, our algorithm can achieve a five-fold speed-up over the current state-of-the-art subgraph mining algorithms.

Categories and Subject Descriptors: H.2.8 [Database Applications]: Data Mining
General Terms: Algorithms
Keywords: Subgraph Mining, Spanning Tree

1. INTRODUCTION
In this paper, we focus on the problem of finding recurring subgraphs from graph databases, which is a very active topic in current data mining research. Graphs provide a general way to model a variety of relations among data, hence finding recurring subgraphs has many applications in interdisciplinary research such as chemical informatics [2] and bioinformatics [11]. There are also many applications in data management research such as efficient storage of semi-structured databases [5], efficient indexing [21], and web information management [16].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. KDD'04, August 22-25, 2004, Seattle, Washington, USA. Copyright 2004 ACM /04/ $5.00.

One performance issue (among many others) in mining large graph databases is the huge number of recurring patterns. The phenomenon is well understood in mining long frequent itemsets. Given a frequent itemset $I$, any subset of $I$ is also frequent, hence the number of such frequent itemsets grows exponentially with $|I|$. In this paper, we propose a new graph mining algorithm that mines only maximal frequent subgraphs.

Given a set of graphs $\mathcal{G}$ (referred to as a graph database), the support of a graph $G$ is defined as the fraction of graphs in $\mathcal{G}$ in which $G$ occurs [9, 20]. $G$ is frequent if its support is at least a user-specified threshold; a frequent subgraph is maximal if none of its supergraphs is frequent [10].

Mining only maximal frequent subgraphs offers the following advantages in processing large graph databases. (1) It significantly reduces the total number of mined subgraphs. In experiments we performed on some realistic data sets, the total number of frequent subgraphs is up to one thousand times greater than the number of maximal subgraphs. We can save both space and subsequent analysis effort if the number of mined subgraphs is significantly reduced. (2) Several pruning techniques, which are detailed in this paper, can be efficiently integrated into the mining process and dramatically reduce the total mining time. (3) The non-maximal frequent subgraphs can be reconstructed from the maximal subgraphs reported. Obtaining the actual frequency (support) of a non-maximal subgraph requires examination of the original database, but it is certain to be at least as high as the frequency of the maximal subgraph that contains it.
In addition, the techniques used in [15] can be easily adapted to approximate the support of all frequent subgraphs within some error bound. (4) In some applications such as discovering structure motifs in a group of homologous proteins [7, 11], maximal frequent subgraphs are the subgraphs of most interest since they encode the maximal structure commonalities within the group.

Our mining method is based on a novel graph mining framework in which we first mine all frequent tree patterns from a graph database and then construct maximal frequent subgraphs from trees. This approach offers asymptotic advantages compared to using subgraphs as building blocks, since tree normalization is a simpler problem than graph normalization. The proposed method enables us to integrate well-developed techniques from mining maximal itemsets and knowledge gained in graph mining into a new algorithm. According to our experimental study, such a combination can offer significant performance speed-up on both synthetic and real data sets.

The framework of our method is versatile. Depending on the particular tree mining algorithm, the search can be either breadth-first or depth-first (preferred due to its better memory utilization). It can also be designed to mine all frequent subgraphs without major modifications.

Technically, we make three contributions: (1) we propose a novel algorithm, SPIN (SPanning tree based maximal graph mINing), to mine only maximal frequent subgraphs of large graph databases, (2) we integrate several optimization techniques, some from existing maximal itemset mining research and some developed by ourselves, to speed up the mining process, and (3) we perform an extensive analysis of the proposed algorithm and analyze its performance on graph data sets with different characteristics.

The remainder of the paper is organized as follows. In Section 2, we present the data structure and the proposed algorithm. Section 3 presents the results of our experimental study using synthetic graph databases and two benchmark chemical data sets. We conclude the paper with a discussion of related work and a conclusion.

2. MAXIMAL SUBGRAPH MINING
In the following discussion, we present a novel framework for mining maximal frequent subgraphs from a graph database. The framework combines tree mining and subgraph mining; we first find all frequent trees from a graph database and then reconstruct the group of frequent subgraphs from the mined trees. There are two important components in the framework. The first is a graph partitioning method through which we group all frequent subgraphs into equivalence classes based on the spanning trees they contain. The second important component is a set of pruning techniques which aim to remove some partitions entirely or partially for the purpose of finding maximal frequent ones only.

There are three reasons we advocate this two-step method for finding maximal graph patterns. First, tree-related operations such as isomorphism, normalization, and testing whether a tree is a subtree of another tree are asymptotically simpler than the comparable operations for graphs (subgraph isomorphism, for instance, is NP-complete). Second, in certain applications, such as chemical compound analysis, most of the frequent subgraphs are really trees. Last but not least, this framework adapts well to maximal frequent subgraph mining, which is the focus of this paper. Using a chemical structure benchmark, we show that 99% of cyclic graph patterns and 60% of tree patterns can be eliminated by our optimization techniques in searching for maximal subgraphs; further details about the efficiency of the optimization techniques can be found in [10].
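To make the point about tree normalization concrete, the following minimal sketch computes a canonical string for a node-labeled free tree in polynomial time. It roots the tree at its center(s) and sorts child encodings recursively, a well-known textbook technique used here purely for illustration. It is not the total order of [4, 10], it ignores edge labels, it assumes labels contain no parentheses, and the names tree_centers, encode, and canonical_form are hypothetical.

```python
def tree_centers(adj):
    """Return the one or two center nodes of a tree by repeatedly peeling leaves."""
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = len(adj)
    leaves = [v for v, d in degree.items() if d <= 1]
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for nb in adj[leaf]:
                degree[nb] -= 1
                if degree[nb] == 1:  # nb has just become a leaf
                    next_leaves.append(nb)
        leaves = next_leaves
    return leaves


def encode(adj, labels, node, parent):
    """Encode the subtree rooted at `node`; sorting the child encodings
    makes the result independent of the order in which children are stored."""
    children = sorted(
        encode(adj, labels, c, node) for c in adj[node] if c != parent
    )
    return "(" + labels[node] + "".join(children) + ")"


def canonical_form(adj, labels):
    """Canonical string of a node-labeled free tree.

    A tree has at most two centers, so taking the larger of the encodings
    rooted at the centers gives a string that depends only on the tree's
    shape and labels, not on how its nodes are named.
    """
    return max(encode(adj, labels, c, None) for c in tree_centers(adj))
```

Two trees are isomorphic (as node-labeled trees) exactly when their canonical strings are equal, so an isomorphism test reduces to a string comparison. No comparably simple procedure is known for general graphs.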
To the best of our knowledge, we are the first to combine two distinct methodologies, mining frequent subgraphs in graph databases and mining frequent trees in forests (sets of trees), for the purpose of designing an efficient subgraph mining algorithm.

2.1 Tree-based Equivalence Classes
We define a subtree of an undirected graph $G$ as an acyclic connected subgraph of $G$. A subtree $T$ is a spanning tree of $G$ if $T$ contains all nodes in $G$. A given graph $G$ has many spanning trees; we define the maximal one, according to a total order defined on trees [4, 10], as the canonical spanning tree of $G$.

EXAMPLE 2.1. In Figure 1, we show an example of a labeled graph $P$ (upper-left) with all four-node subtrees of $P$. Each subtree is represented by its canonical representation and sorted according to the total order (as given in [10]). Each such tree is a spanning tree of the graph $P$, and the first one ($T_1$) is the canonical spanning tree of $P$.

DEFINITION 2.1. Tree-based Equivalence Classes: Given two graphs $P$ and $Q$, we define a binary relation $\cong$ such that $P \cong Q$ if and only if their canonical spanning trees are isomorphic to each other. The relation $\cong$ is reflexive, symmetric, and transitive, and hence an equivalence relation.

Figure 1: Example of a labeled graph $P$ (upper-left), $P$'s subtrees, spanning trees, and its canonical spanning tree ($T_1$).

EXAMPLE 2.2. In Figure 2, we show subgraphs of the graph $P$ in Figure 1 which are not necessarily trees. Subgraphs are grouped together if they share the same canonical spanning tree. The five non-singleton groups are shown here and the remaining twelve groups are all singletons.¹

Figure 2: Example of tree-based equivalence classes for subgraphs in graph $P$, presented in Figure 1.

We can use a simple greedy search algorithm to find the canonical spanning tree of a graph $G$, the details of which are given in [10].
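The grouping of Definition 2.1 can be sketched with a brute-force stand-in for the greedy search of [10]. The code below enumerates every spanning tree of a small connected, node-labeled graph, encodes each tree as a string, and takes the largest string as the key of the canonical spanning tree. Plain string comparison is an assumption substituted for the total order of [4, 10], edge labels are ignored, and all function names are hypothetical. The enumeration is exponential and only suitable for toy inputs.

```python
from collections import defaultdict
from itertools import combinations


def _adjacency(labels, edges):
    adj = {v: [] for v in labels}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def _encode(adj, labels, node, parent=None):
    kids = sorted(_encode(adj, labels, c, node) for c in adj[node] if c != parent)
    return "(" + labels[node] + "".join(kids) + ")"


def tree_key(labels, edges):
    """Canonical string of a labeled tree: the largest encoding over all roots."""
    adj = _adjacency(labels, edges)
    return max(_encode(adj, labels, root) for root in labels)


def is_spanning_tree(labels, edges):
    """True if `edges` (already |V| - 1 of them) connect every node."""
    adj = _adjacency(labels, edges)
    seen, stack = set(), [next(iter(labels))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v])
    return len(seen) == len(labels)


def canonical_spanning_tree_key(labels, edges):
    """Key of the maximal spanning tree of a connected graph (brute force)."""
    candidates = combinations(sorted(edges), len(labels) - 1)
    return max(tree_key(labels, t) for t in candidates if is_spanning_tree(labels, t))


def equivalence_classes(graphs):
    """Group named graphs {name: (labels, edges)} by canonical spanning tree."""
    classes = defaultdict(list)
    for name, (labels, edges) in graphs.items():
        classes[canonical_spanning_tree_key(labels, edges)].append(name)
    return dict(classes)
```

For a triangle with node labels a, b, c, the sketch places the triangle in the same class as the path a-b-c (its maximal spanning tree under the assumed order), while the path b-a-c falls into a class of its own.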
¹ Throughout the paper, we are interested only in subgraphs with at least one edge (i.e. excluding frequent nodes as trivial cases).

Frequent subgraph mining can conceptually be broken into two steps: (1) mine all the frequent trees from a graph database and (2) for each such frequent tree $T$, find all frequent subgraphs whose canonical spanning trees are isomorphic to $T$. Maximal frequent subgraphs can be found among the frequent ones.

We skip the first step in the following discussion for two reasons. First, as pointed out in [20], the current subgraph mining algorithms can be easily tailored to find only trees from a graph database by limiting the topology of the patterns. This is true for CloseGraph [20] as well as for FFSM [9], which is our recently developed depth-first subgraph mining algorithm. Second, most of the techniques developed for mining subtrees from a forest can also be easily adapted for the same purpose. Therefore, in the following discussion, we focus on step 2, which is how to enumerate the equivalence class of a tree $T$ and how maximal subgraph mining is related to this enumeration.

We want to point out that the two-step division of the mining algorithm is artificial, but it makes it easy to explain the key ideas of the algorithm without introducing too many details. In the longer version of this paper [10], we discuss a fully optimized algorithm which (1) uses a modified FFSM algorithm to enumerate trees from graph databases and (2) integrates tree discovery and maximal pattern mining for maximal performance.

2.2 Enumerating Graphs from Trees
We first outline a basic enumeration scheme to search the equivalence class of a tree. We define a joining operation $\oplus$ between a graph (tree) $G$ and a hypothetical edge connecting any two nodes $i$, $j$ in $G$ with label $e_l$ such that $G \oplus (i, j, e_l) = G'$, where $G'$ is a supergraph of $G$ with one additional edge between nodes $i$ and $j$ with label $e_l$. If the graph $G$ already contains an edge between nodes $i$ and $j$, the joining operation fails and produces nothing. If $G'$ is frequent, we denote the hypothetical edge $(i, j, e_l)$ as a candidate edge for $G$. The above definition can serve as the basis for a recursive definition of the joining operation between a graph $G$ and a candidate edge set $E = \{e_1, e_2, \ldots, e_n\}$ such that $G \oplus E = (G \oplus e_1) \oplus \{e_2, \ldots, e_n\}$.

Let us assume we have already calculated the set of candidate edges $C = \{c_1, c_2, \ldots, c_n\}$ from the set of all possible frequent hypothetical edges. We define the search space of $G$, denoted by $G : C$, as the set of graphs which might be produced by joining the graph $G$ and a candidate edge set in the powerset of $C$ (denoted by $2^C$). That is:

$$G : C = \{\, G \oplus y \mid y \in 2^{C} \,\} \qquad (1)$$

In the following discussion, the group of candidate edges is sometimes referred to as the tail of the graph $G$ in its search space. We present a recursive algorithm in Table 2 to enumerate the search space for a graph $G$. The procedure we use to calculate the set of candidate edges for a tree pattern can be found in [10].

EXAMPLE 2.3. In Figure 3, we single out the largest equivalence class (Class I) from Figure 2. We show a tree $K$ together with its tail $C$ = {($k_2$, $k_3$, ), ($k_3$, $k_4$, )}. $K$'s search space is hence composed of four graphs $\{K, K_{S1}, K_{S2}, K_{S3}\}$ ($K$ is always included in its search space) and is organized into a search tree in analogy to frequent itemset mining. This tree structure follows the recursive procedure we present in Table 2.

Figure 3: Example of enumerating a graph's search space. We use dashed lines on the subgraphs $K_{S1}$ and $K_{S3}$ to denote the fact that they will be pruned by an optimization technique (bottom-up pruning) which is discussed in Section 2.3.1.

Before we proceed to details about mining maximal frequent subgraphs, we outline the enumeration scheme discussed so far in Table 1 and Table 2. Our strategy is quite straightforward: we first find all frequent trees; trees are expanded to cyclic graphs by searching their search spaces; and maximal frequent subgraphs are constructed from the frequent ones. We notice that this algorithm is correct, which means we are guaranteed to find all maximal frequent subgraphs. However, it is not efficient in that we still need to enumerate all frequent subgraphs to construct the maximal ones. In the next section, we introduce optimization techniques to improve the search for maximal frequent subgraphs.

Algorithm Maximal Subgraph Mining($\mathcal{G}$, $\sigma$)
begin
1. $R \leftarrow \{T \mid T \text{ is a frequent tree in } \mathcal{G}\}$
2. $S \leftarrow \{G \mid G \in \text{Expansion}(T) \text{ and } T \in R\}$
3. return $\{G \mid G \in S \text{ and } G \text{ is maximal}\}$
end
Table 1: An outline of the maximal subgraph mining algorithm

Algorithm Expansion($T$)
begin
1. $C \leftarrow \{c \mid c \text{ is a candidate edge for } T\}$
2. $S \leftarrow \text{Search Graphs}(T, C)$
3. return $\{G \mid G \in S$, $G$ is frequent, and $G$ has the same canonical spanning tree as $T$ has$\}$
end

Algorithm Search Graphs($G$, $C = \{c_1, c_2, \ldots, c_n\}$)
begin
1. $Q \leftarrow \{G\}$
2. for each $c_i \in C$
3. $Q \leftarrow Q \cup \text{Search Graphs}(G \oplus c_i, \{c_{i+1}, c_{i+2}, \ldots, c_n\})$
4. end for
5. return $Q$
end
Table 2: An algorithm for exploring the equivalence class of a tree $T$

2.3 Optimizations: Global and Local Maximal Subgraphs
In this section, we explore several techniques for fast maximal frequent subgraph mining. These techniques (pruning techniques) dynamically remove a set of frequent subgraphs that cannot be maximal from a search space. To that end, we define a frequent subgraph $G$ to be locally maximal if it is maximal in its equivalence class, i.e. $G$ has no frequent supergraph(s) that share the same canonical spanning tree as $G$; we refer to a subgraph as globally maximal if it is maximal frequent in a graph database. Clearly, every globally maximal subgraph must be locally maximal, but not every locally maximal subgraph is necessarily globally maximal. Our pruning techniques aim to avoid enumerating subgraphs which are not locally maximal.

Not surprisingly, the problem of finding all locally maximal frequent subgraphs can be transformed to the well-known maximal frequent itemset mining problem. Each candidate edge is an item; the joining operation can be viewed as the union operation for itemsets; and each locally maximal subgraph corresponds to a maximal frequent itemset in its search space. Hence, we advocate the following pruning techniques, which are partially adapted from maximal itemset mining and partially developed in the graph mining context, for maximal frequent subgraph mining.
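The unoptimized enumeration and the itemset view described above can be sketched as follows. A graph is modeled as a frozenset of edges and the tail as a list of candidate edges, so that joining becomes set union. The frequency test is passed in as a function is_frequent, a hypothetical oracle standing in for support counting against the database. The check that a result keeps the same canonical spanning tree as the starting tree is omitted, so this is an illustration of the recursion only, not the full Expansion procedure.

```python
def search_graphs(graph, tail):
    """Enumerate graph joined with every subset of `tail`.

    Mirrors the recursion of the basic Search Graphs procedure: keep the
    current graph, then for each candidate edge recurse on the graph
    extended by that edge, with only the later candidates as the new tail.
    """
    found = [graph]
    for i, edge in enumerate(tail):
        found += search_graphs(graph | {edge}, tail[i + 1:])
    return found


def locally_maximal(tree, tail, is_frequent):
    """Frequent members of the search space with no frequent proper superset.

    `is_frequent` is an assumed oracle; in a real miner it would count
    occurrences in the graph database.
    """
    space = [g for g in search_graphs(frozenset(tree), tail) if is_frequent(g)]
    return [g for g in space if not any(g < h for h in space)]
```

With a two-edge tail the search space has four members, as in Example 2.3. If every member is frequent, only the tree joined with the whole tail is locally maximal, which is the situation that the pruning techniques of the following sections exploit.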

2.3.1 Bottom-Up Pruning
The search space of a graph $G$ is exponential in the cardinality of the candidate edge set $C$. One heuristic to avoid such an exponential search space is to check whether the largest possible candidate $G' = G \oplus C$ is frequent or not. If $G'$ is frequent, every other graph in the search space is a subgraph of $G'$ and hence not maximal. This heuristic is referred to as bottom-up pruning and can be applied to every step in the recursive search procedure presented in Table 2. By applying bottom-up pruning to the equivalence class I presented in Figure 2, graphs $K_{S1}$ and $K_{S3}$ are pruned.

Dynamic Reordering: An important technique related to the efficiency of bottom-up pruning is the so-called dynamic reordering technique, which works in two ways. First, it trims infrequent candidate edges from the tail of a graph to reduce the size of the search space (a candidate edge can become infrequent after several iterations since other edges are incorporated into the patterns). Second, it rearranges the order of the elements in the tail according to their support values. For example, given a graph's tail $C$, by dynamic reordering, we sort the elements in $C$ by their support values, from lowest to highest. After this sorting, the infrequent heads are trimmed. At the end of the remaining tail is a family of elements individually having high support, and hence the pattern obtained by grouping them together is likely to still have a high support value. This heuristic is widely used in mining maximal itemsets to gain performance. However, without the spanning tree framework, applying dynamic reordering is very difficult in any of the current subgraph mining algorithms, which intrinsically have a fixed order of adding edges to an existing pattern for various performance considerations.

2.3.2 Tail Shrink
Given a graph $G$ and a supergraph $G'$ of $G$, an embedding of $G$ in $G'$ is a subgraph isomorphism $f$ from $G$ to $G'$. We prefer the term embedding to subgraph isomorphism, though they are interchangeable, for the purpose of intuitive descriptions. In Figure 4, we show a subgraph $L$ and its supergraph $P$. There are two embeddings of $L$ in $P$: $(l_1 \to p_1, l_2 \to p_2, l_3 \to p_3, l_4 \to p_4)$ and $(l_1 \to p_1, l_2 \to p_3, l_3 \to p_2, l_4 \to p_4)$. We define a candidate edge $(i, j, e_l)$ to be associative to a graph $G$ if it appears in every embedding of $G$ in a graph database. In other words, a candidate edge $(i, j, e_l)$ of $G$ is associative if and only if for every embedding $f$ of $G$ in a graph $G'$, $G'$ has the edge $(f(i), f(j))$ with label $e_l$. One example of an associative edge is the edge ($l_1$, $l_3$, ) to the tree $L$ shown in Figure 4.

If a tree $T$ has a set of associative edges $\{e_1, e_2, \ldots, e_n\}$, any maximal frequent graph $G$ which is a supergraph of $T$ must contain all such edges. Hence we can remove these edges from the tail of $T$ and augment them to $T$ without missing any maximal ones. This technique is referred to as the tail shrink technique.

Tail shrink has two advantages: (1) it reduces the search space and (2) it can be used to prune the entire equivalence class in certain cases. To elaborate the latter point, we define a set of associative edges $C$ of a tree $T$ to be lethal if the resulting graph $G = T \oplus C$ has a canonical spanning tree other than that of $T$. For example, in Figure 4, the associative edge $e$ = ($l_1$, $l_3$, ) of $L$ is lethal since $G = L \oplus e$ has a different canonical spanning tree than that of $L$. In the same example, the lethal edge $e$ can be augmented to each member of class II to produce a supergraph with the same support. Therefore the whole class can be pruned away once we detect lethal edge(s) to the tree $L$. Detecting a group of lethal edges can do further pruning other than trimming off the whole equivalence class. Those details as well as the formal proof of the optimization are discussed in [10].

Figure 4: An example showing how tail shrink might be used to prune the whole equivalence class. Edge $e$ = ($l_1$, $l_3$, ), denoted by a dashed line to be distinguished from other edges, is associative to tree $L$ and lethal to $L$ as well. The graph obtained by joining $L$ and $e$ should belong to equivalence class I shown in Figure 2.

2.3.3 External-Edge Pruning
In this section, we introduce a technique to remove one equivalence class without any knowledge about its candidate edges. We refer to this technique as external-edge pruning. We define an edge to be an external edge for a graph $G$ if it connects a node in $G$ and a node which is not in $G$. We represent an external edge as a three-element tuple $(i, e_l, v_l)$ to stand for the fact that we introduce an edge with label $e_l$ incident on the node $i$ in graph $G$ and a node with node label $v_l$ which does not belong to $G$. An external edge $(i, e_l, v_l)$ is associative to a graph $G$ if and only if, for every embedding $f$ of $G$ in a graph $G'$, $G'$ has a node $v$ with the label $v_l$, $v$ connects to the node $f(i)$ with an edge labeled $e_l$ in $G'$, and there is no node $j \in V[G]$ such that $v = f(j)$.

EXAMPLE 2.4. We show two examples of associative external edges in Figure 5. One is ($m_1$, , ) for the tree $M$ and another one is (n f 1, , ) for the tree $N$. If a tree $T$ has at least one associative external edge, the entire equivalence class of $T$ can be pruned, since the same edge can be augmented to every member of the class and therefore none of them is maximal. In this example, both equivalence classes IV and V can be eliminated due to external-edge pruning.

Figure 5: Examples showing external edges and associative external edges.

In brief summary, we present three pruning techniques to speed up maximal subgraph mining. For the graph $P$ shown in Figure 1, there are a total of twenty-five subgraphs of $P$, including itself and excluding the null graph. These subgraphs are partitioned into five non-singleton classes, shown in Figure 2, and twelve singleton classes (not shown). There is only one maximal subgraph, namely, graph $P$ itself. We have successfully pruned every one of the five non-singleton equivalence classes ($P$ of equivalence class I is left untouched since it is maximal). What we do not show further is that we can apply the same techniques to the remaining twelve singleton equivalence classes to eliminate all of them. Interested readers might verify that themselves. Table 3 and Table 4 integrate these optimizations into the basic enumeration technique we presented in Table 1 and Table 2.

Algorithm MSubgraph-Expansion($T$)
begin
1. $C \leftarrow \{c \mid c \text{ is a candidate edge for } T\}$
2. $A \leftarrow \{c \mid c \in C \text{ and } c \text{ is associative}\}$
3. if $A$ is lethal, return $\emptyset$   # tail shrink
4. $S \leftarrow \text{Search Graphs}(T \oplus A, C - A)$
5. return $\{G \mid G \in S$, $G$ is frequent, and $G$ has the same canonical spanning tree as $T$ has$\}$
end

Algorithm Search Graphs($G$, $C = \{c_1, c_2, \ldots, c_n\}$)
begin
1. if $G \oplus C$ is frequent, return $\{G \oplus C\}$   # bottom-up pruning
2. $Q \leftarrow \{G\}$
3. for each $c_i \in C$
4. $Q \leftarrow Q \cup \text{Search Graphs}(G \oplus c_i, \{c_{i+1}, c_{i+2}, \ldots, c_n\})$
5. end for
6. return $Q$
end
Table 3: An algorithm for exploring the equivalence class of a tree $T$ for maximal subgraph mining

Algorithm Maximal Subgraph Mining($\mathcal{G}$, $\sigma$)
begin
1. $R \leftarrow \{T \mid T \text{ is a frequent tree in } \mathcal{G}\}$
2. $S \leftarrow \{G \mid G \in \text{MSubgraph-Expansion}(T)$, $T$ has no external associative edge, and $T \in R\}$   # external-edge pruning
3. return $\{G \mid G \in S \text{ and } G \text{ is maximal}\}$
end
Table 4: An outline of the maximal subgraph mining algorithm

Due to the space limitation, several important details are omitted, which include: (1) how to enumerate frequent trees from a graph database using a modified FFSM algorithm, (2) how to interleave the tree mining algorithm and the maximal subgraph mining algorithm and deliver the final optimized algorithm, (3) how we guarantee that each reported pattern is (a) frequent, (b) maximal, and (c) unique, (4) how to calculate the edge candidates for a tree, and (5) how to determine associative external edges. Those can be found in [10].
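The interplay of bottom-up pruning, dynamic reordering, and tail shrink can be sketched on the itemset view of a single equivalence class. The sketch makes a strong simplifying assumption that is not in the paper: the tree occurs exactly once in each database graph, so each database graph is reduced to the set of candidate edges present at that occurrence, and support becomes a plain count over these sets. Lethal-edge detection, external-edge pruning, and the canonical spanning tree check are left out, and all names are hypothetical.

```python
def support(itemset, transactions):
    """Number of transactions (edge sets at an occurrence) containing `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def search(base, tail, transactions, min_sup):
    """Pruned enumeration of frequent extensions of `base` by edges in `tail`."""
    # Dynamic reordering: score each candidate against the current base,
    # drop those that are no longer frequent, sort by ascending support.
    scored = [(support(base | {c}, transactions), c) for c in tail]
    tail = [c for s, c in sorted(scored) if s >= min_sup]

    # Bottom-up pruning: if base plus the whole tail is frequent, everything
    # else in this search space is a subset of it and cannot be maximal.
    full = base | set(tail)
    if support(full, transactions) >= min_sup:
        return [full]

    found = [base]
    for i, c in enumerate(tail):
        found += search(base | {c}, tail[i + 1:], transactions, min_sup)
    return found


def maximal_sets(tail, transactions, min_sup):
    """Maximal frequent edge sets for one tree, with tail shrink applied first."""
    # Tail shrink: edges present at every occurrence are moved into the base.
    associative = {c for c in tail if all(c in t for t in transactions)}
    rest = [c for c in tail if c not in associative]
    found = search(frozenset(associative), rest, transactions, min_sup)
    return [g for g in found if not any(g < h for h in found)]
```

The final filter is still needed because different branches of the recursion can report sets that are subsets of one another; the pruning steps only reduce how many such sets are generated.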
3. EXPERIMENTAL STUDY
We performed our empirical study using a single processor of a 2.8GHz Pentium Xeon with 512KB L2 cache and 2GB main memory, running RedHat Linux 7.3. The SPIN algorithm is implemented using the C++ programming language and compiled using g++ with -O3 optimization. We compared SPIN with two alternative subgraph mining algorithms: FFSM [9] and gSpan [19]. Every maximal subgraph reported in the synthetic and real data sets is cross-validated using results from FFSM and gSpan to make sure it is (a) frequent, (b) maximal, and (c) unique.

3.1 Synthetic Dataset
To evaluate the performance of the SPIN algorithm, we first generate a set of synthetic graph databases using a synthetic data generator [13]. In Figure 6, we present the performance comparison of the SPIN, FFSM, and gSpan algorithms for a synthetic data set with different support values. When the support is set to a fairly high value, e.g. 5%, the performance of all three algorithms is very close. SPIN scales much better than the other two algorithms as we decrease the support value. At a support value of 1%, SPIN provides six-fold and ten-fold speed-ups over FFSM and gSpan, respectively. We do not show data with support values greater than 5% since there is little difference among the three methods. More testing results on synthetic data sets can be found in [10].

Figure 6: Left: performance comparison (run time in seconds) under different support values for data set D10kT30L200I11V4E4 using SPIN, FFSM, and gSpan. Here we follow the common convention of encoding the parameters of a synthetic graph database as a string. Right: Total frequent patterns identified by the algorithms.

3.2 Chemical Data Set
We also applied SPIN to two widely used chemical data sets to test its performance. The data sets are obtained from the DTP AIDS Antiviral Screen test, conducted by the U.S. National Cancer Institute. In the DTP data set, chemicals are classified into three sets: confirmed active (CA), confirmed moderately active (CM), and confirmed inactive (CI), according to experimentally determined activities against the HIV virus.
There are a total of 423, 1083, and [...] chemicals in the three sets, respectively. For our own purposes, we used all compounds from CA and from CM to form two data sets, which are subsequently referred to as DTP CA and DTP CM, respectively. The DTP data can be downloaded from dt.html.

In Figure 7, we show the performance comparison of SPIN, FFSM, and gSpan using the DTP CA data set. At a support value of 3.3%, SPIN speeds up mining by up to five-fold compared with FFSM and up to eight-fold compared with gSpan. Mining only maximal subgraphs can reduce the total number of mined patterns by a factor of up to three orders of magnitude in this data set.

We also applied the same algorithms to the data set DTP CM. In this case, SPIN has performance very close to FFSM, and both are around eight times faster than gSpan. However, if we impose an additional constraint to let FFSM output the maximal patterns it finds among the set of frequent patterns, SPIN offers a three-fold speed-up over FFSM.

Figure 7: Left: performance comparison (run time in seconds) under different support values for the DTP CA data set using SPIN, FFSM, and gSpan. Right: Total frequent patterns identified by the algorithms.

4. RELATED WORK
Knowledge discovery from semi-structured data sets is an active topic in the data mining/machine learning community. Many different pattern definitions have been proposed from different perspectives, such as finding patterns from a single large network [14], finding approximately matched patterns [17], mining patterns using domain knowledge from bioinformatics [8], and finding frequent subgraphs. The last one is the focus of our paper.

Recent subgraph mining algorithms can be roughly classified into two categories. Algorithms in the first category use a level-wise search scheme based on the Apriori property to enumerate the recurrent subgraphs [12, 13]. Rather than growing a graph by one single node/edge at a time, Vanetik et al. recently proposed an Apriori-based algorithm using paths as building blocks with a novel support definition [18]. Algorithms in the second category use a depth-first search to enumerate candidate frequent subgraphs [19, 20, 2, 9]. As demonstrated in these papers, depth-first algorithms provide advantages over level-wise search for (1) better memory utilization and (2) efficient subgraph testing, e.g. they usually permit the subgraph test to be performed incrementally at successive levels during the search [9].

Our current work benefits extensively from existing algorithms for maximal itemset mining such as [3, 6] and frequent subtree mining algorithms [1, 22].

5. CONCLUSION AND FUTURE WORK
In this paper we present SPIN, an algorithm to mine maximal frequent subgraphs from a graph database. A new framework, which partitions frequent subgraphs into equivalence classes, is proposed together with a group of optimization techniques. Compared to current state-of-the-art subgraph mining algorithms such as FFSM and gSpan, SPIN offers very good scalability to large graph databases and at least an order of magnitude performance improvement in synthetic graph data sets. The efficiency of the algorithm is also confirmed by a benchmark chemical data set. The algorithm of compressing a large number of frequent subgraphs to a much smaller set of maximal subgraphs will help us to investigate demanding applications such as finding structure patterns from proteins in the future.

Acknowledgement
We thank Dr. Jack Snoeyink at the University of North Carolina for helpful discussions about the paper.

6. REFERENCES
[1] T. Asai, K. Abe, S. Kawasoe, H. Arimura, and H. Sakamoto. Efficient substructure discovery from large semi-structured data. SDM.
[2] C. Borgelt and M. R. Berthold. Mining molecular fragments: Finding relevant substructures of molecules. In Proc. International Conference on Data Mining '02.
[3] D. Burdick, M. Calimlim, and J. Gehrke. MAFIA: A maximal frequent itemset algorithm for transactional databases. ICDE.
[4] Y. Chi, Y. Yang, and R. Muntz. Indexing and mining free trees. ICDM.
[5] A. Deutsch, M. F. Fernandez, and D. Suciu. Storing semistructured data with STORED. In SIGMOD.
[6] K. Gouda and M. J. Zaki. Efficiently mining maximal frequent itemsets. ICDM.
[7] J. Hu, X. Shen, Y. Shao, C. Bystroff, and M. J. Zaki. Mining protein contact maps. 2nd BIOKDD Workshop on Data Mining in Bioinformatics.
[8] J. Huan, W. Wang, D. Bandyopadhyay, J. Snoeyink, J. Prins, and A. Tropsha. Mining protein family specific residue packing patterns from protein structure graphs. In Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB).
[9] J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. In ICDM'03.
[10] J. Huan, W. Wang, J. Prins, and J. Yang. SPIN: Mining maximal frequent subgraphs from graph databases. UNC Technical Report TR04-018.
[11] J. Huan, W. Wang, A. Washington, J. Prins, and A. Tropsha. Accurately classify protein family based on coherent subgraph mining. In Pacific Symposium on Biocomputing.
[12] A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In Proc. of the 4th European Conf. on Principles and Practices of Knowledge Discovery in Databases (PKDD), pages 13-23.
[13] M. Kuramochi and G. Karypis. Frequent subgraph discovery. In Proc. International Conference on Data Mining '01.
[14] M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. SDM.
[15] J. Pei, G. Dong, W. Zou, and J. Han. On computing condensed frequent pattern bases. ICDM.
[16] S. Raghavan and H. Garcia-Molina. Representing web graphs. In Proceedings of the IEEE Intl. Conference on Data Engineering.
[17] N. Vanetik and E. Gudes. Mining frequent labeled and partially labeled graph patterns. ICDE.
[18] N. Vanetik, E. Gudes, and E. Shimony. Computing frequent graph patterns from semi-structured data. Proc. International Conference on Data Mining '02.
[19] X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. In Proc. International Conference on Data Mining '02.
[20] X. Yan and J. Han. CloseGraph: Mining closed frequent graph patterns. KDD'03.
[21] X. Yan, P. Yu, and J. Han. Graph Indexing: A Frequent Structure-based Approach. SIGMOD'04.
[22] M. Zaki. Efficiently mining frequent trees in a forest. SIGKDD.


Chapter 9 Definite Integrals Chpter 9 Definite Integrls In the previous chpter we found how to tke n ntiderivtive nd investigted the indefinite integrl. In this chpter the connection etween ntiderivtives nd definite integrls is estlished

More information

Signal Flow Graphs. Consider a complex 3-port microwave network, constructed of 5 simpler microwave devices:

Signal Flow Graphs. Consider a complex 3-port microwave network, constructed of 5 simpler microwave devices: 3/3/009 ignl Flow Grphs / ignl Flow Grphs Consider comple 3-port microwve network, constructed of 5 simpler microwve devices: 3 4 5 where n is the scttering mtri of ech device, nd is the overll scttering

More information

Lecture 08: Feb. 08, 2019

Lecture 08: Feb. 08, 2019 4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny

More information

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata CS103B ndout 18 Winter 2007 Ferury 28, 2007 Finite Automt Initil text y Mggie Johnson. Introduction Severl childrens gmes fit the following description: Pieces re set up on plying ord; dice re thrown or

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016 CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

5.1 Estimating with Finite Sums Calculus

5.1 Estimating with Finite Sums Calculus 5.1 ESTIMATING WITH FINITE SUMS Emple: Suppose from the nd to 4 th hour of our rod trip, ou trvel with the cruise control set to ectl 70 miles per hour for tht two hour stretch. How fr hve ou trveled during

More information

5.7 Improper Integrals

5.7 Improper Integrals 458 pplictions of definite integrls 5.7 Improper Integrls In Section 5.4, we computed the work required to lift pylod of mss m from the surfce of moon of mss nd rdius R to height H bove the surfce of the

More information

Parse trees, ambiguity, and Chomsky normal form

Parse trees, ambiguity, and Chomsky normal form Prse trees, miguity, nd Chomsky norml form In this lecture we will discuss few importnt notions connected with contextfree grmmrs, including prse trees, miguity, nd specil form for context-free grmmrs

More information

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy:

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy: Scnner Specifying ptterns source code tokens scnner prser IR A scnner must recognize the units of syntx Some prts re esy: errors mps chrcters into tokens the sic unit of syntx x = x + y; ecomes

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

Bases for Vector Spaces

Bases for Vector Spaces Bses for Vector Spces 2-26-25 A set is independent if, roughly speking, there is no redundncy in the set: You cn t uild ny vector in the set s liner comintion of the others A set spns if you cn uild everything

More information

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary Outline Genetic Progrmming Evolutionry strtegies Genetic progrmming Summry Bsed on the mteril provided y Professor Michel Negnevitsky Evolutionry Strtegies An pproch simulting nturl evolution ws proposed

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

Chapter 2. Random Variables and Probability Distributions

Chapter 2. Random Variables and Probability Distributions Rndom Vriles nd Proilit Distriutions- 6 Chpter. Rndom Vriles nd Proilit Distriutions.. Introduction In the previous chpter, we introduced common topics of proilit. In this chpter, we trnslte those concepts

More information

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

4.6 Numerical Integration

4.6 Numerical Integration .6 Numericl Integrtion 5.6 Numericl Integrtion Approimte definite integrl using the Trpezoidl Rule. Approimte definite integrl using Simpson s Rule. Anlze the pproimte errors in the Trpezoidl Rule nd Simpson

More information

Random subgroups of a free group

Random subgroups of a free group Rndom sugroups of free group Frédérique Bssino LIPN - Lortoire d Informtique de Pris Nord, Université Pris 13 - CNRS Joint work with Armndo Mrtino, Cyril Nicud, Enric Ventur et Pscl Weil LIX My, 2015 Introduction

More information

CM10196 Topic 4: Functions and Relations

CM10196 Topic 4: Functions and Relations CM096 Topic 4: Functions nd Reltions Guy McCusker W. Functions nd reltions Perhps the most widely used notion in ll of mthemtics is tht of function. Informlly, function is n opertion which tkes n input

More information

The Trapezoidal Rule

The Trapezoidal Rule _.qd // : PM Pge 9 SECTION. Numericl Integrtion 9 f Section. The re of the region cn e pproimted using four trpezoids. Figure. = f( ) f( ) n The re of the first trpezoid is f f n. Figure. = Numericl Integrtion

More information

Section 4: Integration ECO4112F 2011

Section 4: Integration ECO4112F 2011 Reding: Ching Chpter Section : Integrtion ECOF Note: These notes do not fully cover the mteril in Ching, ut re ment to supplement your reding in Ching. Thus fr the optimistion you hve covered hs een sttic

More information

Chapter 6 Techniques of Integration

Chapter 6 Techniques of Integration MA Techniques of Integrtion Asst.Prof.Dr.Suprnee Liswdi Chpter 6 Techniques of Integrtion Recll: Some importnt integrls tht we hve lernt so fr. Tle of Integrls n+ n d = + C n + e d = e + C ( n ) d = ln

More information

Surface maps into free groups

Surface maps into free groups Surfce mps into free groups lden Wlker Novemer 10, 2014 Free groups wedge X of two circles: Set F = π 1 (X ) =,. We write cpitl letters for inverse, so = 1. e.g. () 1 = Commuttors Let x nd y e loops. The

More information

LINEAR ALGEBRA APPLIED

LINEAR ALGEBRA APPLIED 5.5 Applictions of Inner Product Spces 5.5 Applictions of Inner Product Spces 7 Find the cross product of two vectors in R. Find the liner or qudrtic lest squres pproimtion of function. Find the nth-order

More information

M344 - ADVANCED ENGINEERING MATHEMATICS

M344 - ADVANCED ENGINEERING MATHEMATICS M3 - ADVANCED ENGINEERING MATHEMATICS Lecture 18: Lplce s Eqution, Anltic nd Numericl Solution Our emple of n elliptic prtil differentil eqution is Lplce s eqution, lso clled the Diffusion Eqution. If

More information

1B40 Practical Skills

1B40 Practical Skills B40 Prcticl Skills Comining uncertinties from severl quntities error propgtion We usully encounter situtions where the result of n experiment is given in terms of two (or more) quntities. We then need

More information

What Is Calculus? 42 CHAPTER 1 Limits and Their Properties

What Is Calculus? 42 CHAPTER 1 Limits and Their Properties 60_00.qd //0 : PM Pge CHAPTER Limits nd Their Properties The Mistress Fellows, Girton College, Cmridge Section. STUDY TIP As ou progress through this course, rememer tht lerning clculus is just one of

More information

Chapter 2 Finite Automata

Chapter 2 Finite Automata Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht

More information

The Knapsack Problem. COSC 3101A - Design and Analysis of Algorithms 9. Fractional Knapsack Problem. Fractional Knapsack Problem

The Knapsack Problem. COSC 3101A - Design and Analysis of Algorithms 9. Fractional Knapsack Problem. Fractional Knapsack Problem The Knpsck Prolem COSC A - Design nd Anlsis of Algorithms Knpsck Prolem Huffmn Codes Introduction to Grphs Mn of these slides re tken from Monic Nicolescu, Univ. of Nevd, Reno, monic@cs.unr.edu The - knpsck

More information

1 Nondeterministic Finite Automata

1 Nondeterministic Finite Automata 1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you

More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018 CS 301 Lecture 04 Regulr Expressions Stephen Checkowy Jnury 29, 2018 1 / 35 Review from lst time NFA N = (Q, Σ, δ, q 0, F ) where δ Q Σ P (Q) mps stte nd n lphet symol (or ) to set of sttes We run n NFA

More information

3 x x x 1 3 x a a a 2 7 a Ba 1 NOW TRY EXERCISES 89 AND a 2/ Evaluate each expression.

3 x x x 1 3 x a a a 2 7 a Ba 1 NOW TRY EXERCISES 89 AND a 2/ Evaluate each expression. SECTION. Eponents nd Rdicls 7 B 7 7 7 7 7 7 7 NOW TRY EXERCISES 89 AND 9 7. EXERCISES CONCEPTS. () Using eponentil nottion, we cn write the product s. In the epression 3 4,the numer 3 is clled the, nd

More information

2.4 Linear Inequalities and Interval Notation

2.4 Linear Inequalities and Interval Notation .4 Liner Inequlities nd Intervl Nottion We wnt to solve equtions tht hve n inequlity symol insted of n equl sign. There re four inequlity symols tht we will look t: Less thn , Less thn or

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz University of Southern Cliforni Computer Science Deprtment Compiler Design Fll Lexicl Anlysis Smple Exercises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sciences Institute 4676 Admirlty Wy, Suite

More information

Logarithms. Logarithm is another word for an index or power. POWER. 2 is the power to which the base 10 must be raised to give 100.

Logarithms. Logarithm is another word for an index or power. POWER. 2 is the power to which the base 10 must be raised to give 100. Logrithms. Logrithm is nother word for n inde or power. THIS IS A POWER STATEMENT BASE POWER FOR EXAMPLE : We lred know tht; = NUMBER 10² = 100 This is the POWER Sttement OR 2 is the power to which the

More information

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science CSCI 340: Computtionl Models Kleene s Theorem Chpter 7 Deprtment of Computer Science Unifiction In 1954, Kleene presented (nd proved) theorem which (in our version) sttes tht if lnguge cn e defined y ny

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

Finite Automata-cont d

Finite Automata-cont d Automt Theory nd Forml Lnguges Professor Leslie Lnder Lecture # 6 Finite Automt-cont d The Pumping Lemm WEB SITE: http://ingwe.inghmton.edu/ ~lnder/cs573.html Septemer 18, 2000 Exmple 1 Consider L = {ww

More information

Mathematics Number: Logarithms

Mathematics Number: Logarithms plce of mind F A C U L T Y O F E D U C A T I O N Deprtment of Curriculum nd Pedgogy Mthemtics Numer: Logrithms Science nd Mthemtics Eduction Reserch Group Supported y UBC Teching nd Lerning Enhncement

More information

QUADRATURE is an old-fashioned word that refers to

QUADRATURE is an old-fashioned word that refers to World Acdemy of Science Engineering nd Technology Interntionl Journl of Mthemticl nd Computtionl Sciences Vol:5 No:7 011 A New Qudrture Rule Derived from Spline Interpoltion with Error Anlysis Hdi Tghvfrd

More information

Designing Information Devices and Systems I Spring 2018 Homework 7

Designing Information Devices and Systems I Spring 2018 Homework 7 EECS 16A Designing Informtion Devices nd Systems I Spring 2018 omework 7 This homework is due Mrch 12, 2018, t 23:59. Self-grdes re due Mrch 15, 2018, t 23:59. Sumission Formt Your homework sumission should

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Introduction to Electrical & Electronic Engineering ENGG1203

Introduction to Electrical & Electronic Engineering ENGG1203 Introduction to Electricl & Electronic Engineering ENGG23 2 nd Semester, 27-8 Dr. Hden Kwok-H So Deprtment of Electricl nd Electronic Engineering Astrction DIGITAL LOGIC 2 Digitl Astrction n Astrct ll

More information

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.) CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts

More information

DFA minimisation using the Myhill-Nerode theorem

DFA minimisation using the Myhill-Nerode theorem DFA minimistion using the Myhill-Nerode theorem Johnn Högerg Lrs Lrsson Astrct The Myhill-Nerode theorem is n importnt chrcteristion of regulr lnguges, nd it lso hs mny prcticl implictions. In this chpter,

More information

FORM FIVE ADDITIONAL MATHEMATIC NOTE. ar 3 = (1) ar 5 = = (2) (2) (1) a = T 8 = 81

FORM FIVE ADDITIONAL MATHEMATIC NOTE. ar 3 = (1) ar 5 = = (2) (2) (1) a = T 8 = 81 FORM FIVE ADDITIONAL MATHEMATIC NOTE CHAPTER : PROGRESSION Arithmetic Progression T n = + (n ) d S n = n [ + (n )d] = n [ + Tn ] S = T = T = S S Emple : The th term of n A.P. is 86 nd the sum of the first

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

Satellite Retrieval Data Assimilation

Satellite Retrieval Data Assimilation tellite etrievl Dt Assimiltion odgers C. D. Inverse Methods for Atmospheric ounding: Theor nd Prctice World cientific Pu. Co. Hckensck N.J. 2000 Chpter 3 nd Chpter 8 Dve uhl Artist depiction of NAA terr

More information

Boolean Algebra. Boolean Algebra

Boolean Algebra. Boolean Algebra Boolen Alger Boolen Alger A Boolen lger is set B of vlues together with: - two inry opertions, commonly denoted y + nd, - unry opertion, usully denoted y ˉ or ~ or, - two elements usully clled zero nd

More information

Lecture Solution of a System of Linear Equation

Lecture Solution of a System of Linear Equation ChE Lecture Notes, Dept. of Chemicl Engineering, Univ. of TN, Knoville - D. Keffer, 5/9/98 (updted /) Lecture 8- - Solution of System of Liner Eqution 8. Why is it importnt to e le to solve system of liner

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below. Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we

More information

8.6 The Hyperbola. and F 2. is a constant. P F 2. P =k The two fixed points, F 1. , are called the foci of the hyperbola. The line segments F 1

8.6 The Hyperbola. and F 2. is a constant. P F 2. P =k The two fixed points, F 1. , are called the foci of the hyperbola. The line segments F 1 8. The Hperol Some ships nvigte using rdio nvigtion sstem clled LORAN, which is n cronm for LOng RAnge Nvigtion. A ship receives rdio signls from pirs of trnsmitting sttions tht send signls t the sme time.

More information

1 Online Learning and Regret Minimization

1 Online Learning and Regret Minimization 2.997 Decision-Mking in Lrge-Scle Systems My 10 MIT, Spring 2004 Hndout #29 Lecture Note 24 1 Online Lerning nd Regret Minimiztion In this lecture, we consider the problem of sequentil decision mking in

More information

10.2 The Ellipse and the Hyperbola

10.2 The Ellipse and the Hyperbola CHAPTER 0 Conic Sections Solve. 97. Two surveors need to find the distnce cross lke. The plce reference pole t point A in the digrm. Point B is meters est nd meter north of the reference point A. Point

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Lecture 6. Notes. Notes. Notes. Representations Z A B and A B R. BTE Electronics Fundamentals August Bern University of Applied Sciences

Lecture 6. Notes. Notes. Notes. Representations Z A B and A B R. BTE Electronics Fundamentals August Bern University of Applied Sciences Lecture 6 epresenttions epresenttions TE52 - Electronics Fundmentls ugust 24 ern University of pplied ciences ev. c2d5c88 6. Integers () sign-nd-mgnitude representtion The set of integers contins the Nturl

More information

Formal Languages and Automata

Formal Languages and Automata Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University

More information

Connected-components. Summary of lecture 9. Algorithms and Data Structures Disjoint sets. Example: connected components in graphs

Connected-components. Summary of lecture 9. Algorithms and Data Structures Disjoint sets. Example: connected components in graphs Prm University, Mth. Deprtment Summry of lecture 9 Algorithms nd Dt Structures Disjoint sets Summry of this lecture: (CLR.1-3) Dt Structures for Disjoint sets: Union opertion Find opertion Mrco Pellegrini

More information

AQA Further Pure 2. Hyperbolic Functions. Section 2: The inverse hyperbolic functions

AQA Further Pure 2. Hyperbolic Functions. Section 2: The inverse hyperbolic functions Hperbolic Functions Section : The inverse hperbolic functions Notes nd Emples These notes contin subsections on The inverse hperbolic functions Integrtion using the inverse hperbolic functions Logrithmic

More information

CONIC SECTIONS. Chapter 11

CONIC SECTIONS. Chapter 11 CONIC SECTIONS Chpter. Overview.. Sections of cone Let l e fied verticl line nd m e nother line intersecting it t fied point V nd inclined to it t n ngle α (Fig..). Fig.. Suppose we rotte the line m round

More information

LAMEPS Limited area ensemble forecasting in Norway, using targeted EPS

LAMEPS Limited area ensemble forecasting in Norway, using targeted EPS Limited re ensemle forecsting in Norwy, using trgeted Mrit H. Jensen, Inger-Lise Frogner* nd Ole Vignes, Norwegin Meteorologicl Institute, (*held the presenttion) At the Norwegin Meteorologicl Institute

More information

The size of subsequence automaton

The size of subsequence automaton Theoreticl Computer Science 4 (005) 79 84 www.elsevier.com/locte/tcs Note The size of susequence utomton Zdeněk Troníček,, Ayumi Shinohr,c Deprtment of Computer Science nd Engineering, FEE CTU in Prgue,

More information

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom Lerning Gols Continuous Rndom Vriles Clss 5, 8.05 Jeremy Orloff nd Jonthn Bloom. Know the definition of continuous rndom vrile. 2. Know the definition of the proility density function (pdf) nd cumultive

More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Chapter 3 Single Random Variables and Probability Distributions (Part 2)

Chapter 3 Single Random Variables and Probability Distributions (Part 2) Chpter 3 Single Rndom Vriles nd Proilit Distriutions (Prt ) Contents Wht is Rndom Vrile? Proilit Distriution Functions Cumultive Distriution Function Proilit Densit Function Common Rndom Vriles nd their

More information

September 13 Homework Solutions

September 13 Homework Solutions College of Engineering nd Computer Science Mechnicl Engineering Deprtment Mechnicl Engineering 5A Seminr in Engineering Anlysis Fll Ticket: 5966 Instructor: Lrry Cretto Septemer Homework Solutions. Are

More information

Lecture 3: Equivalence Relations

Lecture 3: Equivalence Relations Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts

More information

Section - 2 MORE PROPERTIES

Section - 2 MORE PROPERTIES LOCUS Section - MORE PROPERTES n section -, we delt with some sic properties tht definite integrls stisf. This section continues with the development of some more properties tht re not so trivil, nd, when

More information

Interpreting Integrals and the Fundamental Theorem

Interpreting Integrals and the Fundamental Theorem Interpreting Integrls nd the Fundmentl Theorem Tody, we go further in interpreting the mening of the definite integrl. Using Units to Aid Interprettion We lredy know tht if f(t) is the rte of chnge of

More information

New data structures to reduce data size and search time

New data structures to reduce data size and search time New dt structures to reduce dt size nd serch time Tsuneo Kuwbr Deprtment of Informtion Sciences, Fculty of Science, Kngw University, Hirtsuk-shi, Jpn FIT2018 1D-1, No2, pp1-4 Copyright (c)2018 by The Institute

More information

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations.

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations. Lecture 3 3 Solving liner equtions In this lecture we will discuss lgorithms for solving systems of liner equtions Multiplictive identity Let us restrict ourselves to considering squre mtrices since one

More information

Regular expressions, Finite Automata, transition graphs are all the same!!

Regular expressions, Finite Automata, transition graphs are all the same!! CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1

More information

Math 1B, lecture 4: Error bounds for numerical methods

Math 1B, lecture 4: Error bounds for numerical methods Mth B, lecture 4: Error bounds for numericl methods Nthn Pflueger 4 September 0 Introduction The five numericl methods descried in the previous lecture ll operte by the sme principle: they pproximte the

More information

Expressivity versus Efficiency of Graph Kernels

Expressivity versus Efficiency of Graph Kernels Expressivity versus Efficiency of Grph Kernels Jn Rmon 1 nd Thoms Gärtner 2,3 1 Deprtment of Computer Science, K.U.Leuven, Belgium 2 Frunhofer Institut Autonome Intelligente Systeme, Germny 3 Deprtment

More information

Continuous Random Variable X:

Continuous Random Variable X: Continuous Rndom Vrile : The continuous rndom vrile hs its vlues in n intervl, nd it hs proility distriution unction or proility density unction p.d. stisies:, 0 & d Which does men tht the totl re under

More information

CS 330 Formal Methods and Models

CS 330 Formal Methods and Models CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2017 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 2 1. Prove ((( p q) q) p) is tutology () (3pts) y truth tle. p q p q

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University U.U.D.M. Project Report 07:4 Frey Frctions Rickrd Fernström Exmensrete i mtemtik, 5 hp Hledre: Andres Strömergsson Exmintor: Jörgen Östensson Juni 07 Deprtment of Mthemtics Uppsl University Frey Frctions

More information

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a).

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a). The Fundmentl Theorems of Clculus Mth 4, Section 0, Spring 009 We now know enough bout definite integrls to give precise formultions of the Fundmentl Theorems of Clculus. We will lso look t some bsic emples

More information

Matching patterns of line segments by eigenvector decomposition

Matching patterns of line segments by eigenvector decomposition Title Mtching ptterns of line segments y eigenvector decomposition Author(s) Chn, BHB; Hung, YS Cittion The 5th IEEE Southwest Symposium on Imge Anlysis nd Interprettion Proceedings, Snte Fe, NM., 7-9

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automt Theory nd Forml Lnguges TMV027/DIT321 LP4 2018 Lecture 10 An Bove April 23rd 2018 Recp: Regulr Lnguges We cn convert between FA nd RE; Hence both FA nd RE ccept/generte regulr lnguges; More

More information

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS.

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS. THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS RADON ROSBOROUGH https://intuitiveexplntionscom/picrd-lindelof-theorem/ This document is proof of the existence-uniqueness theorem

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)

More information