Querying Communities in Relational Databases

Lu Qin, Jeffrey Xu Yu, Lijun Chang, Yufei Tao
The Chinese University of Hong Kong, Hong Kong, China

Abstract

Keyword search on relational databases provides users with insights that they cannot easily observe using traditional RDBMS techniques. Here, an l-keyword query is specified by a set of l keywords, {k_1, k_2, ..., k_l}. It finds how the tuples that contain the keywords are connected in a relational database via the possible foreign key references. Conceptually, it is to find some structural information in a database graph, where nodes are tuples and edges are foreign key references. Existing work studied how to find connected trees for an l-keyword query. However, a tree may only show partial information about how the tuples that contain the keywords are connected. In this paper, we focus on finding communities for an l-keyword query. A community is an induced subgraph that contains all the l keywords within a given distance. We propose new efficient algorithms to find all/top-k communities, which consume small memory, for an l-keyword query. For top-k l-keyword queries, our algorithm allows users to interactively enlarge k at run time. We conducted extensive performance studies using two large real datasets to confirm the efficiency of our algorithms.

I. INTRODUCTION

Keyword search on relational databases has been widely studied in recent years. It treats a relational database as a database graph G_D by considering the tuples as nodes and the foreign key references as edges between nodes, and searches for the hidden connections between the tuples that contain keywords specified in a user-given l-keyword query (k_1, k_2, ..., k_l). Almost all existing work aims at finding the minimal connected trees that contain all the l keywords in a database graph or in the underlying relational database [1], [2], [3], [4], [5], [6], [7], where some focused on finding all the minimal connected trees and some focused on finding the top-k minimal connected trees. In this paper, we explore two key issues.
The first is whether it is in the best interest of users to find minimal connected trees on a database graph G_D, and the second is how to efficiently find subgraphs (instead of trees) for user-given l-keyword queries, if that is highly desirable. We discuss the first issue in the introduction, and focus on the efficiency issue in the rest of the paper.

Consider a small graph, G, shown in Fig. 1(a). The graph G shows the co-authorship and the citation between two papers. There are 5 nodes (3 authors and 2 papers). The three authors are John Smith, Jim Smith, and Kate Green, and the two papers are paper1 and paper2. There are 6 edges. The paper paper1 was co-authored by John Smith and Kate Green, and the paper paper2 was co-authored by Kate Green, John Smith, and Jim Smith. In addition, paper1 cited paper2. The edges are weighted. The weights of the edges from paper1 to John Smith and Kate Green are 1 and 2, respectively, because John Smith was the first author and Kate Green was the second author. In a similar fashion, the author order for the three authors who wrote paper2 is indicated by the weights associated with the corresponding edges. We assume that the weight on the citation edge from paper1 to paper2 is 4.

Fig. 1. Graph and Community: (a) Graph; (b) Community (center nodes, path nodes, keyword nodes).

Next, consider a 2-keyword query, Kate and Smith, against the small graph. All the 5 trees, T_i, for 1 ≤ i ≤ 5, are listed in Fig. 2. T_1 (Fig. 2(a)) shows that John Smith and Kate Green wrote a paper together. T_2 (Fig. 2(b)) indicates that John Smith wrote a paper, which was cited by a paper written by Kate Green. Each of the first 4 trees, T_i, for 1 ≤ i ≤ 4, in Fig. 2 gives some pieces of information about John Smith and Kate Green. But none of the 4 trees gives a whole picture of the relationships between these two authors. There are two problems. One is that a user may find some information, but at the same time may miss some information he/she is really interested in when browsing the resulting trees. For example, a user may want to find how many papers John Smith and Kate Green co-authored. T_1 only shows that they co-authored one paper.
The other problem is that the number of resulting trees can be large for an l-keyword query, which makes it difficult for users to find all the information they need. We propose to find communities (multi-center graphs) for an l-keyword query. Fig. 1(b) illustrates a community. It is an induced subgraph over a set of nodes, namely, keyword nodes, center nodes, and additional path nodes. (a) For a given l-keyword query, in a community, there are up to l keyword nodes. All the user-given l keywords appear in the keyword nodes. (b) There are center nodes, where each center connects to every keyword node within a threshold called radius. In other words, the distance along the shortest path between a center node and a keyword node is less than or equal to the radius. (c) The additional nodes, called path nodes, are the nodes that appear on any path from a center node, v_c, to a keyword node, v_l, if the distance along such a path between v_c and v_l is less than or equal to the radius.

Fig. 2. Five Trees: (a) T_1; (b) T_2; (c) T_3; (d) T_4; (e) T_5.

Fig. 3. Two Communities: (a) R_1; (b) R_2.

Fig. 3 shows two such communities for the same 2-keyword query with radius 6. Consider the community R_1 in Fig. 3(a). There are 2 keyword nodes that contain the two keywords, Kate and Smith, respectively. There are 2 center nodes, paper1 and paper2. Because the given radius is 6 and there is a path from paper1 to Kate Green via paper2 with total weight 5 (≤ 6), the edge from paper1 to paper2 is also included in the community. The community R_1 (Fig. 3(a)) includes all the information represented by the 4 trees, T_i, for 1 ≤ i ≤ 4, in Fig. 2.

The semantics of such communities have been studied. Similar concepts are also used in co-citation analysis and authority/hub analysis [8]. To the best of our knowledge, this is the first study of finding multi-center communities in relational databases. In addition, we study finding communities using a radius, which bounds the minimum total weight along a path from a center node to a keyword node. This is different from other reported studies that find the core of web communities as bipartite graphs [8].

Contributions of this paper:

(1) We propose a general concept called community as a multi-center directed graph in a relational database when the relational database is considered as a database graph, G_D.

(2) We propose an algorithm to enumerate all communities in polynomial delay under query-and-data complexity [9]. We show that the communities found are complete and duplication-free. By complete, we mean that we find all communities. We introduce a weak duplication-free concept under which we design a polynomial delay algorithm with time complexity of O(l · (n log n + m)) and space complexity of O(l · n + m), where n and m are the number of nodes and the number of edges in G_D, and l is the number of user-given keywords. Polynomial delay enumeration algorithms are considered the best algorithms for enumerating results [9], [0], [4].
(3) We propose a polynomial delay algorithm with the same time complexity of O(l · (n log n + m)) and space complexity of O(l · k + l · n + m), to find the exact top-k communities under a ranking order. One main advantage of our algorithm is that we allow a user to interactively reset the value of k for finding the top-k communities during run-time without overhead.

(4) We propose an efficient indexing method to index the database graph, G_D. With such an index, we show that we can project a small database graph for an l-keyword query, and find the same set of communities. The search space can be significantly reduced.

(5) We conduct extensive performance studies to confirm the efficiency of our algorithms using real datasets.

Organization: The remainder of the paper is organized as follows. Section II gives our problem statement. Section III discusses several possible solutions, and highlights the main ideas of our polynomial delay algorithms after introducing several categories of enumeration algorithms. In Section IV, we discuss our algorithm to find all communities, and in Section V, we discuss our algorithm to find top-k communities. We also introduce our indexing and give an algorithm to project a subgraph of G_D for an l-keyword query to be evaluated. Experimental studies are given in Section VII, followed by discussions on related work in Section VIII. Finally, Section IX concludes the paper.

II. PROBLEM STATEMENT

Following the notations used in [], [3], [4], [7], we model a relational database RDB as a weighted directed graph, G_D = (V, E), where V is the set of nodes (tuples) in RDB, and E is the set of edges between nodes based on the foreign key references between the corresponding tuples in RDB. Here, a node, v ∈ V, may contain keywords, and a directed edge, (u, v) ∈ E, is associated with a weight, denoted w_e((u, v)). Given two nodes u and v, we use dist(u, v) to denote the total weight along the shortest path between u and v. In the following discussions, we focus on general directed graphs. Our approach can be easily applied to undirected or bi-directed graphs [], [3], [7].
We use V (G) nd E(G) to denote the set of nodes nd the set of edges for given grph G. Welso denote the number of nodes nd the number of edges in the dtbse grph, G D,usingn = V (G D ) nd m = E(G D ). Fig. 4 shows n exmple of dtbse grph, G D, whih onsists of 3 nodes, v i,for i 3. Some nodes ontin keywords: v 4 nd v 3 ontin keyword, v nd v 8 ontin keywordb, nd v 3, v 6, v 9 nd v ontin keyword. All the edges re weighted, for exmple, w e ((v,v )) = 5. Given dtbse grph G D,nl-keyword query onsists of set of l keywords, {k,k,,k l }.Anl-keyword query is to find set of ommunities, R = {R (V,E),R (V,E), }, whih we define below.

3 v b v v 5 b v v3 b v () R 5 v 4 5 b 4 4 v 6 3 v v 7 5 Fig. 4. v 4 v 4 v 5 (b) R 5 v 8 5 v 5 v 9 3 v 0 v 3 3 A Simple Dtbse Grph, G D v 4 b v 8 v 6 v 9 Fig. 5. v 7 () R 3 b v8 v 9 Five ommunities 5 6 (d) R 4 b v 8 v 3 v 3 v 3 v 0 v (e) R 5 v v 3 Definition.: (Community) A ommunity, R i (V,E), is multi-enter indued subgrph of the dtbse grph G D. Here, V is union of three subsets, V = V V l V p.()v l represents set of nodes lled knodes (keyword nodes). Every knode v l V l ontins t lest keyword nd ll l keywords must pper in t lest one knode in V l.()v represents set of nodes lled nodes (enter nodes). For ny node v V, there exists t lest single pth suh tht dist(v,v l ) Rmx (rdius) between v nd every v l V l.(3)v p represents the nodes, lled pnodes, whih pper on ny pth from node v V to knode v l V l if dist(v,v l ) Rmx. Note tht node or pnode my lso ontin some keywords, nd node n be knode nd node t the sme time. E(R i ) is the set of edges, (u, v) E(G D ), for every pir of u nd v tht pper in V (R i ). It is worth noting tht ommunity, R i, is uniquely determined by knodes, V l, whih we ll the ore of the ommunity, nd denote it s ore(r i ). For simpliity, we use C to represent ore s list of l nodes, C =[,,, l ], nd my use C[i] to denote i C, where i ontins the keyword k i. Exmple.: Consider the dtbse grph G D (Fig. 4). Let Rmx =8. For3-keyword query {, b, }, 5 ommunities, R i,for i 5, re shown in Fig. 5. For exmple, for R 5 (Fig. 5 (e)), knodes (ore(r 5 ))rev l = {v 3,v 8,v }, nodes re V = {v,v }, nd pnodes rev p = {v 0 }. For ommunity, R i, ost funtion n be defined, denoted ost(r i ), using edge weights w e (). We define ost(r i ) s the minimum totl edge weight from node to every knode on the orresponding shortest pth, in this pper. For exmple, onsider the ommunity R 5 (Fig. 5 (e)). There re two enters, v nd v. The totl edge weight over the shortest pths from v to the 3 knodes, v 8, v, nd v 3, is = ( + 3) (3 + 3). 
The totl edge weight over the shortest pths from v to the 3 knodes, v 8, v, nd v 3,is 4=(3++3)+3+3. Therefore, ost(r 5 )=.Given For simpliity, we ignore node weights in this pper, nd our pproh n support node weights. two ommunities, R i nd R j, R i is rnked higher thn R j, denoted R i R j,ifost(r i ) ost(r j ). The highest rnk is number. The rnking for the 5 ommunities given in Fig. 5 is shown in Tble I. Note tht our work does not rely on speifi ost funtion. Rnk Knodes Grph Cost Center b v 4 v 8 v 6 R 3 7 {v 4,v 7} v 3 v 8 v 9 R 4 0 {v 9} 3 v 3 v 8 v R 5 {v,v } 4 v 4 v v 3 R 4 {v } 5 v 4 v v 9 R 5 {v 5} TABLE I RANKING Problem Sttement: In this pper we study two interrelted problems for n l-keyword query ginst dtbse grph G D, with user-given rdius Rmx, nmely, finding ll ommunities nd finding the top-k ommunities, for given l-keyword query. We denote them s COMM-ll nd COMMk, respetively. For both, the resulting ommunities must be omplete nd duplition-free. By omplete, we men tht we explore ll ombintions of the keywords to identify ommunities bsed on ll possible ores. By duplition-free, we men for ny two ommunities R nd R, ore(r) ore(r ). In other words, let C =[,,, l ] nd C = [,,, l ] be the two ores for the two ommunities, R nd R, C[i] C [i] for some i. We define duplition with the following onsidertion. A ommunity is uniquely determined by its ore, beuse otherwise the ost of heking whether two grphs re the sme is too expensive bsed on grph isomorphism. III. AN OVERVIEW In this setion, we disuss severl possible solutions nd ddress the effiieny we wnt to hieve. For proessing n l-keyword query, {k,k,,k l }, with Rmx, ginst G D,letV i be the set of nodes in V (G D ) tht ontin the keyword k i, nd let V i be the number of nodes in V i, for i l. Beuse ore C uniquely determines ommunity, we disuss how to find ll nd duplition-free ores in this setion, nd ddress the effiieny problems. First, we onsider nive pproh using the 3-keyword query in Exmple.. 
For processing COMM-all, it needs a nested loop to check all combinations of nodes that contain keywords, as follows.

1: for v_i ∈ V_1 do
2:   for v_j ∈ V_2 do
3:     for v_k ∈ V_3 do
4:       form a core candidate C[v_i, v_j, v_k];
5:       output C if there is a center which connects every node in C within Rmax;

The three for-loops together check every combination of the three sets, V_1, V_2, and V_3, and compute every possible core, C. The complexity is O(|V_max|^3), where |V_max| = max{|V_1|, |V_2|, |V_3|}. In general, for an l-keyword query, it is exponential in nature, O(n^l), where n = |V(G_D)|. Since it checks every distinct combination of nodes that contain the three keywords, the result is complete and duplication-free.

In order to find communities based on the user-given keywords, an expanding approach can be adopted to solve our problem, which is to expand from nodes, step by step, until communities can be identified. First, we can expand from all the nodes in V_i that contain the keyword k_i, for 1 ≤ i ≤ l. We call it bottom-up expanding, and outline it below.

1: V ← V_1 ∪ V_2 ∪ V_3;
2: let each u ∈ G_D maintain l sets where each set, u.V_i, keeps the nodes v ∈ V_i that u can reach within Rmax;
3: repeat
4:   find a new node, w, that is expanded from v_i ∈ V within Rmax;
5:   add v_i into w.V_i if v_i contains the keyword k_i;
6:   if w ≠ ∅ and all w.V_i are non-empty then
7:     output new cores found;
8: until w = ∅

During the expansion process, when a keyword node, v_i ∈ V_i, expands to a node, u, it implies that u can reach v_i, and we maintain v_i in a set denoted u.V_i at the node u. In other words, the set u.V_i maintains all the nodes containing keyword k_i that can be reached from u within Rmax. If all u.V_i, for 1 ≤ i ≤ l, are non-empty, there exist cores of communities. The number of core candidates at a node u is O(|u.V_max|^3), where |u.V_max| = max{|u.V_1|, |u.V_2|, |u.V_3|}. When a core is to be output, the algorithm first checks whether the candidate is a duplicate. For doing so, the algorithm maintains a pool of the already output cores. When a new core is to be output, it checks whether it is already in the pool. If it is not in the pool, the algorithm outputs it and adds it into the pool.

Second, in a similar fashion, we can expand from any node u ∈ V(G_D) in a top-down fashion up to Rmax, and check whether it can contain cores of communities by maintaining u.V_i in the same way as in the bottom-up expanding approach. Note that both top-down and bottom-up expanding approaches can find all communities. In other words, the results are complete and duplication-free.

A. Enumeration Delay

In this paper, we investigate novel enumeration algorithms [9] for supporting COMM-all and COMM-k queries. For our problem, an enumeration algorithm, A, outputs all/top-k communities, O = (R_1, R_2, ..., R_|O|), for an l-keyword query against a database graph G_D, with a radius Rmax. We consider the graph G_D and the l keywords as the input, and denote it as I. The size of the input, I, is |I| = n + m + l, where n and m are the numbers of nodes and edges of G_D, respectively. Note that Rmax is a constraint rather than input data. The size of the output is |O|, where in O, R_i ≠ R_j if core(R_i) ≠ core(R_j) (duplication-free).

First, consider the COMM-all queries. The naive approach (nested loop), in the worst case, takes exp(I) time, that is O(n^l), to output all the results of size |O|. The complexity of the naive approach is unrelated to the output size, |O|, which can be even much larger than the input I. Therefore, even if an enumeration algorithm, A, is not polynomial in the input I, it may be seen as reasonable, because, when O dominates, all algorithms need to output O. It is therefore necessary to consider the complexity by taking both the input, I, and the output, O, into consideration. In the literature [9], [], there are several categories of enumeration algorithms, namely, polynomial total time, incremental polynomial time, and polynomial delay. Here, polynomial total time means that the processing time of the algorithm, A, is polynomial in the sizes of both input and output, |I| + |O|. Incremental polynomial time implies that the processing time to output the o-th answer, which does not necessarily follow any ranking order, is polynomial in the combined size of the input and the first o answers output already, |I| + o. And polynomial delay implies that the o-th output is produced in time which is polynomial only in |I|. Obviously, the best algorithm is a polynomial delay algorithm. A bottom-up/top-down expanding algorithm is not a polynomial delay algorithm, because it needs to check the already output results in order to ensure that the output is duplication-free.
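To make the role of the duplicate pool concrete, the following is a minimal Python sketch of the output step of an expanding approach. It is an illustration under stated assumptions, not the authors' implementation: the per-node sets u.V_i are assumed to be precomputed and passed in as a dictionary called reach, and the names expand_cores and reach are hypothetical.

```python
from itertools import product


def expand_cores(reach):
    """Yield each distinct core once, using a pool of already output cores.

    reach (hypothetical input): dict mapping a node u to a list of l sets,
    where reach[u][i] holds the nodes containing keyword k_i that u can
    reach within Rmax (the sets u.V_i of the expanding approach).
    """
    pool = set()  # cores that have already been output
    for u, sets in reach.items():
        # u can only act as a center if every u.V_i is non-empty.
        if sets and all(sets):
            for core in product(*sets):
                # The pool grows with the output, so the work spent per
                # answer depends on the answers produced so far.
                if core not in pool:
                    pool.add(core)
                    yield list(core)
```

The pool is what keeps the output duplication-free, and it is also why the time between two answers cannot be bounded by the input size alone.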
A bottom-up/top-down expanding algorithm is an incremental polynomial time algorithm, because it is polynomial in the combined size of the input and the results that have been generated.

Second, consider the COMM-k queries, which are to output the top-k communities in ranking order. In [], Lawler gives a procedure (Lawler's procedure) to compute the k best solutions to discrete optimization problems, and shows that if the number of computational steps required to find an optimal solution to a problem with l (0, 1) variables is c(l), then the amount of computation required to obtain the k best solutions is O(l · c(l)). Kimelfeld et al. propose a polynomial delay algorithm that adopts Lawler's procedure to find top-k Steiner trees for keyword search problems [4]. Based on Lawler's procedure, for COMM-k queries, it is straightforward to obtain an algorithm with time complexity O(l · c(l)), where c(l) is the time complexity to compute the top-1 community. However, in this paper, we propose new algorithms to compute COMM-k queries, which take O(c(l)) instead of O(l · c(l)).

B. New Enumeration Delay Algorithms

We highlight the main ideas of our novel polynomial delay algorithms for processing COMM-all/COMM-k queries, followed by detailed discussions in the following sections.

1: find the first best core, C;
2: while C ≠ ∅ do
3:   output the community based on C;
4:   C ← Next();

First, we discuss our algorithm for processing COMM-all queries. As shown above, it first finds the first best core C. In the while loop, it outputs the community based on C, which is duplication-free. Here, Next() is a procedure to determine the next core. The main issue is how to enumerate all cores (complete). We explain it using the 3-keyword query in Example 2.1. Suppose that the first core determined is C = [v_a, v_b, v_c] for an l-keyword query. Here, v_a, v_b, and v_c contain the keywords a, b, and c, respectively. We need to ensure that such a C will not be enumerated again. In doing so, we divide the entire search space, V_1 × V_2 × V_3, into 4 subspaces (l + 1):

S_1: {v_a} × {v_b} × {v_c},
S_2: (V_1 − {v_a}) × V_2 × V_3,
S_3: {v_a} × (V_2 − {v_b}) × V_3, and
S_4: {v_a} × {v_b} × (V_3 − {v_c}).

It is important to know the following facts. (a) S_1 is the current core found. (b) V_1 × V_2 × V_3 = S_1 ∪ S_2 ∪ S_3 ∪ S_4, which implies that we can enumerate all cores (complete). (c) S_i ∩ S_j = ∅ for i ≠ j (duplication-free). In order to enumerate all cores, we propose a depth-first traversal algorithm. Conceptually, there exists a virtual root node, which represents the entire search space, and, as shown in this example, it has 4 child nodes (S_1, S_2, S_3, S_4) representing 4 subspaces. Suppose that we find the next best core in one of the subspaces, say S_4. With the same procedure, we further divide S_4 into 4 subspaces in a similar way, traversing the virtual tree in a depth-first fashion. The time complexity of our algorithm is O(l · (n log n + m)) using space O(l · n + m). The same idea can be easily extended to support any l-keyword query.

Below, we outline our algorithm for COMM-k queries.

1: H ← ∅;
2: find the first best core, C;
3: H.enheap(C);
4: while H ≠ ∅ do
5:   g ← H.deheap();
6:   output the community based on g.C;
7:   Next();

For COMM-k queries, we need to output communities following their ranking order. In doing so, we use a Fibonacci heap (H). We explain our main idea using the same example. We find the first best core, C, in the entire space V_1 × V_2 × V_3, and we ensure the first core found is the core for the top-1 community. We enheap C with other information into the heap H, and enter the while loop. In the while loop, first, we deheap the core C with the smallest cost from H. We compute its community and output it. Then, we call Next().
In Next(), we attempt to find the next best core in each of the three subspaces, S_2, S_3, and S_4, individually. If we find the best core, C_i, in S_i, for 1 < i ≤ 4, we enheap C_i to H. With H, the next best core, with the smallest cost, can be selected in the next iteration from all the cores kept in H. We repeatedly deheap one core, C, from H, identify the subspace where C is in, say S, further divide S into l = 3 subspaces, find the best cores in the 3 subspaces, and enheap them into H for finding the next best core. The time complexity of our algorithm is also O(l · (n log n + m)) using space O(l · k + l · n + m).

Algorithm 1 COMM-all (G_D, {k_1, k_2, ..., k_l}, Rmax)
Input: the database graph (G_D), the set of keywords {k_1, k_2, ..., k_l}, radius Rmax.
Output: all communities.
1: for i = 1 to l do
2:   V_i ← the set of nodes in G_D containing k_i;
3:   S_i ← V_i;
4:   N_i ← Neighbor(G_D, S_i, Rmax);
5: (C, cost) ← BestCore(N_1, N_2, ..., N_l);
6: while C ≠ ∅ do
7:   R ← GetCommunity(G_D, C, Rmax);
8:   output R;
9:   C ← Next(G_D, C, Rmax);
10: Procedure Next(G_D, C, Rmax)
11: for i = 1 to l do
12:   N_i ← Neighbor(G_D, {C[i]}, Rmax);
13: for i = l downto 1 do
14:   S_i ← S_i − {C[i]};
15:   N_i ← Neighbor(G_D, S_i, Rmax);
16:   (C', cost') ← BestCore(N_1, N_2, ..., N_l);
17:   if C' ≠ ∅ then
18:     return C';
19:   S_i ← V_i;
20:   N_i ← Neighbor(G_D, S_i, Rmax);
21: return ∅;

IV. FIND ALL COMMUNITIES

The algorithm for processing COMM-all is shown in Algorithm 1, where it takes three inputs: the database graph G_D, the set of keywords, {k_1, k_2, ..., k_l}, and the radius Rmax. First, for every keyword k_i, it finds all nodes in V(G_D) that contain k_i, and assigns them into V_i (line 2), which can be done efficiently using the full text index []. For every V_i, we introduce S_i, which represents the currently available subset of V_i, as candidates for finding the next community. Initially, S_i is set to be V_i (line 3). For S_i, it finds the subset of V(G_D), called the neighborset of S_i and denoted N_i, by calling procedure Neighbor() (Algorithm 2). In N_i, every node v_j must have at least one neighbor v_k ∈ S_i such that dist(v_j, v_k) ≤ Rmax. Note that S_i ⊆ V_i ⊆ N_i ⊆ V(G_D).
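The search phase of Next() in Algorithm 1 relies on the division of the search space around the current core into l + 1 subspaces. As a rough illustration of that division only (it does not compute neighborsets or costs), here is a small Python sketch. The function name split_space and the representation of a subspace as a list of per-position sets are assumptions made for this example.

```python
from itertools import product


def split_space(sets, core):
    """Split V_1 x ... x V_l around `core` into l + 1 disjoint subspaces.

    sets: list of l collections, sets[i] is V_{i+1}.
    core: list of l nodes with core[i] taken from sets[i].
    Each subspace is returned as a list of l sets (one per position).
    """
    l = len(sets)
    subspaces = [[{c} for c in core]]  # S_1 holds only the core itself
    for i in range(l):
        sub = [{core[j]} for j in range(i)]           # earlier positions are pinned
        sub.append(set(sets[i]) - {core[i]})          # position i excludes core[i]
        sub.extend(set(sets[j]) for j in range(i + 1, l))  # later positions are free
        subspaces.append(sub)
    return subspaces
```

Expanding each subspace with itertools.product on a small input is one way to check facts (b) and (c) from Section III-B: the subspaces cover the whole search space and do not overlap.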
Then, it attempts to output all communities in the rest of the algorithm. It finds the first core of a community, C, associated with a cost, by calling BestCore() (Algorithm 3) with all neighborsets, N_i, for 1 ≤ i ≤ l (line 5). In the while loop, the unique community, R, for a non-empty core C is determined by GetCommunity() (Algorithm 4). The while loop repeats by calling the Next() procedure, which finds the next core, C. For simplicity, we assume that all variables V_i and S_i, for 1 ≤ i ≤ l, are global variables, so we do not need to pass them to the procedure Next(). In Next(), there are two main parts: the preparation phase (lines 11-12), and the search phase to find the next core (lines 13-20). We explain the two main parts below. Recall that C is represented as a list of l nodes, C = [c_1, c_2, ..., c_l], where c_i = C[i] contains the keyword k_i. It attempts to find the next core, C' = [c'_1, c'_2, ..., c'_l], which contains at least one node c'_i ≠ c_i ∈ C. It is important to note that C ≠ C', because

6 in the lst itertion, t lest i i sine i is removed from S i (line 4). In ddition, C is not only different from the urrent, C, but lso different from ny ore found up to this stge (duplition free). Finlly, the proedure Next() serh the entire serh spe, nd does not miss ny possible ores. We will explin these issues in detil lter, fter showing n exmple. b v 4 S S S 3 N N v8 v 6 N3 () initil V V V 3 v 4 b v 8 v 6 N S 3 N 3 N V V V 3 (b) fter [v 4,v 8,v 6 ] Fig. 6. Finding ores v 4 b v 8 S S 3 N N v 4 v v 3 N3 () [v 4,v,v 3 ] Reonsider the dtbse grph G D in Fig. 4 for the 3- keyword query, k =, k = b, nd k 3 =, with Rmx = 8. Initilly, fter line 4, S = V = {v 4,v 3 }, S = V = {v 8,v }, nd S 3 = V 3 = {v 6,v 3,v 9,v }.Also, the three neighborsets re: N = {v, v 4, v 5, v 7, v 8, v 9, v, v, v 3 }, N = {v, v, v 4, v 5, v 7, v 8, v 9, v 0, v,v }, nd N 3 = {v, v, v 3, v 4, v 5, v 6, v 7, v 9, v, v }.The BestCore() will identify ore bsed on the nodes in the intersetion of N N N 3 = {v,v 4,v 5,v 7,v 9,v,v }, beuse only node in the intersetion of the neighborsets n possibly serve s enter to onnet the nodes tht ontin ll three keywords. Suppose tht BestCore() identifies ore C =[v 4,v 8,v 6 ] entered t v 7 with ost of 7 (line 5) (Refer to Fig. 6()). The first ommunity bsed on the ore C is uniquely identified s R 3 in Tble I (Fig. 5 ()), nd is output (line 7-8). Then, it lls Next() to find the next ore (line 9). In Next(), it ttempts to find the next ore bsed on the urrent ore C =[v 4,v 8,v 6 ]. It omputes the three neighborsets, N, N, nd N 3, for the three nodes in C regrding them s the enter, respetively. After line, N = {v,v 4,v 5,v 7 }, N = {v 4,v 7,v 8,v 9,v 0,v,v }, nd N 3 = {v 4,v 6,v 7 }. Then, in the for loop, initilly i =3. Note tht C =[v 4,v 8,v 6 ], C[3] = v 6, nd S 3 = {v 3,v 9,v } fter removing C[3] = v 6 from S 3 (line 4). It implies tht the next ore should not ontin v 6 in S 3. It reomputes N 3 using S 3, N 3 is reset to be N 3 = {v,v,v 3,v 5,v 9,v,v } (line 5). 
Then, it ttempts to find the next ore by lling BestCore() using the three newly omputed neighborsets, N, N, nd N 3. However, s n be seen, the intersetion of N N N 3 =, therefore, it is impossible to find ore. BestCore() will return n empty C (line 6)(Refer to Fig. 6(b)). Beuse C =, it will move to the next itertion. Before returning to the min while loop, it resets S 3 V 3 (line 9), beuse ny new ombintion, to form new ore C in the next V V V 3 Algorithm Neighbor(G D, V i, Rmx) Input: G D is the dtbse grph, nd V i V (G D) Output: neighborset of V i within Rmx. : let G t(v t,e t) be virtul grph suh s V t = V (G D) {t} nd E t = E(G D) {(v, t) v V i} where every w e((v, t)) = 0; : run the Dijkstr s lgorithm to find the shortest pths from ll nodes in V t to t; {onsider (u, v) E t s (v, u) (reverse order)} 3: N i {u dist(u, t) Rmx u V (G D)}; 4: return N i; itertion, n possibly ontin ny node in the entire V 3.It lso reomputes N 3 = {v,v,v 3,v 4,v 5,v 6,v 7,v 9,v,v }. In the next itertion, for i =, it repets the similr proedure strting from S S {v 8 } = {v } beuse C[] = v 8. The new neighborset N beomes {v,v,v 5 } (line 5), nd C =[v 4,v,v 3 ] (line 6). Sine C, it returns the new ore C in line 8 (Refer to Fig. 6()). Fig. 6 shows the min ides. The three sets, V i,for i 3, re represented s three retngles. In retngle, the shded prt is the subset of V i tht does not need to be serhed in n itertion. The irles represent neighborsets. A. The Three Subproblems In this setion, we disuss the detils of the three proedures used in Algorithm, nmely, Neighbor(), BestCore(), nd GetCommunity(). The lgorithm for Neighbor() is shown in Algorithm. It tkes three inputs: the dtbse grph G D, the set of nodes V i where every node v V i ontins the keyword k i, nd the rdius Rmx. TheNeighbor() will find the neighborset of V i, denoted N i, suh tht every node u N i hs t lest node v V i where dist(u, v) Rmx. The nodes in N i hve the potentil to be node in ommunity. Obviously, V i N i. 
In Algorithm 2, it constructs a virtual graph G_t(V_t, E_t) where V_t = V(G_D) ∪ {t} and E_t = E(G_D) ∪ {(v, t) | v ∈ V_i} (line 1). In other words, the virtual graph G_t has one additional sink node, t, and additional edges from every v ∈ V_i to t. For a newly added edge (v, t), the weight, w_e((v, t)), is set to zero. Then, it runs Dijkstra's algorithm to find the shortest paths from the newly added sink node t to all nodes in V_t, by considering every edge (u, v) ∈ E_t as (v, u) (reverse order) (line 2). Then, it identifies the neighborset of V_i as N_i = {u | dist(u, t) ≤ Rmax ∧ u ∈ V(G_D)}. It is interesting to note that, because the shortest path from any node u ∈ N_i to the sink node t must pass through a node v ∈ V_i and the weight from v ∈ V_i to t is zero, if dist(u, t) ≤ Rmax for u ∈ N_i, the node u must have at least one near node v ∈ V_i such that dist(u, v) ≤ Rmax. The time complexity of Algorithm 2 is the time complexity of Dijkstra's algorithm, O(n log n + m), where n = |V(G_D)| and m = |E(G_D)| for a given database graph G_D. For every node, u, in the computed neighborset, N_i, we store the nearest node v ∈ V_i that contains the keyword k_i, and the shortest distance, and denote them as src(N_i, u) and min(N_i, u), respectively. The space complexity for N_i is O(n).
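The idea behind Neighbor() can be sketched in Python as follows. This is a hedged illustration rather than the paper's code: instead of adding the virtual sink t explicitly, it seeds Dijkstra's algorithm with every node of V_i at distance 0 and walks the edges in reverse, which yields the same distances dist(u, t). The adjacency format (a dict from a node to a list of (target, weight) pairs) and the function name neighbor are assumptions. It uses Python's binary heap from heapq, so the running time is O((n + m) log n) rather than the O(n log n + m) bound of a Fibonacci-heap Dijkstra.

```python
import heapq


def neighbor(edges, sources, rmax):
    """Return {u: (distance, nearest source)} for nodes within rmax of a source.

    edges (assumed format): dict u -> list of (v, w), one entry per
    directed edge (u, v) with non-negative weight w.
    sources: the keyword nodes V_i; rmax: the radius Rmax.
    The stored pair plays the role of (min(N_i, u), src(N_i, u)).
    """
    # Build the reversed adjacency so the search runs against edge direction.
    rev = {}
    for u, outs in edges.items():
        for v, w in outs:
            rev.setdefault(v, []).append((u, w))

    best = {}
    # Starting from all sources at distance 0 mimics the zero-weight edges to t.
    heap = [(0, s, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u, src = heapq.heappop(heap)
        if u in best:
            continue  # u was already settled with a shorter or equal distance
        best[u] = (d, src)
        for p, w in rev.get(u, []):
            nd = d + w
            if nd <= rmax and p not in best:
                heapq.heappush(heap, (nd, p, src))
    return best
```

Keeping the nearest source alongside the distance is what later lets a core be read off directly for any candidate center.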

Algorithm 3 BestCore({N_1, N_2, ..., N_l})
Input: all l neighborsets.
Output: the best core and its cost.
1: C ← ∅; best ← +∞;
2: N ← ∩_{i=1}^{l} N_i;
3: for all u ∈ N do
4:   c ← nearestCore(u);
5:   if cost(c) < best then
6:     C ← c; best ← cost(c);
7: return (C, best);

The BestCore() algorithm is shown in Algorithm 3. It takes l neighborsets as input, and finds the best core C = [c_1, c_2, ..., c_l] where c_i contains the keyword k_i. Note that V_i ⊆ N_i, for 1 ≤ i ≤ l. It computes the intersection of all neighborsets N_i, for 1 ≤ i ≤ l, denoted N (line 2). Every node u ∈ N must be able to serve as a center to form a core C = [c_1, c_2, ..., c_l] because dist(u, c_i) ≤ Rmax for every c_i ∈ C. In the for loop (lines 3-6), the core C with the smallest cost(C) is determined. Here, nearestCore(u) identifies the core centered at u, and cost() computes the cost of the core. We achieve O(n) time to find the best next core with some preparation, which is done by sharing the computational cost of Neighbor() using additional data structures.

In implementation, we maintain a data structure, with three elements, for every node u ∈ V(G_D). The first element is a list of l pairs. The i-th pair maintains the nearest node of u, say v_i, that contains the keyword k_i, as well as the total weight along the path between u and v_i (dist(u, v_i)). For the i-th pair, we record (v_i, dist(u, v_i)) if there exists v_i ∈ V_i with dist(u, v_i) ≤ Rmax; otherwise we record (∅, +∞). The second element maintains the total weight Σ_{i=1}^{l} dist(u, v_i) over those v_i with dist(u, v_i) ≤ Rmax. The third element counts how many v_i, for 1 ≤ i ≤ l, satisfy dist(u, v_i) ≤ Rmax. If the counter is l, it implies that the corresponding u can be a possible center in a community. The space for the table is O(l · n). We update the data structure while computing neighborsets without additional cost in terms of time complexity. With such a data structure available, in BestCore(), we only need to scan the data structure once and then find the core with the smallest cost.

The algorithm for GetCommunity() is shown in Algorithm 4. It takes three inputs, the database graph G_D, a core C, and Rmax, and uniquely determines a community, R(V, E), based on the core C.
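The BestCore() scan of Algorithm 3 described above can be sketched in Python as follows. This is a simplified illustration under assumptions: each neighborset is represented as a dict from a node u to a pair (distance, nearest keyword node), and the names best_core and neighborsets are hypothetical. It recomputes the sum per candidate instead of maintaining the per-node sum and counter incrementally, so it costs O(l · n) per call rather than the O(n) scan obtained with the shared data structure.

```python
def best_core(neighborsets):
    """Return (core, cost) for the cheapest center, or (None, inf) if none exists.

    neighborsets (assumed format): list of l dicts; neighborsets[i][u] is
    (dist(u, v_i), v_i) for the nearest node v_i containing keyword k_i
    that u reaches within Rmax.
    """
    if not neighborsets:
        return None, float("inf")

    # Only nodes present in every neighborset can serve as a center.
    candidates = set(neighborsets[0])
    for ns in neighborsets[1:]:
        candidates &= set(ns)

    best, best_cost = None, float("inf")
    for u in candidates:
        # Cost of the core centered at u: total distance to its nearest knodes.
        cost = sum(ns[u][0] for ns in neighborsets)
        if cost < best_cost:
            best = [ns[u][1] for ns in neighborsets]
            best_cost = cost
    return best, best_cost
```

Returning None with an infinite cost corresponds to the empty core that ends the search in a subspace.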
Note that a community based on a core C is an induced subgraph R(V, E) where V = V_c ∪ V_l ∪ V_p. Here V_l is the set of knodes and V_l = C. We need to determine the set of cnodes V_c and the set of pnodes V_p. First, we identify the set of cnodes, V_c, where each cnode v_c ∈ V_c can reach every knode c ∈ C such that dist(v_c, c) ≤ Rmax (line 1). In order to find the set of cnodes, V_c, we construct a virtual reversed graph whose node set is V(G_D) and whose edge set is {(u, v) | (v, u) ∈ E(G_D)}, for a given database graph G_D. In our implementation, for every node u ∈ V(G_D), we keep a pair of numbers, namely, u.sum and u.count. Both are initialized to zero. Then, for each knode, c ∈ C, we compute the shortest paths from c to all the other nodes using Dijkstra's algorithm. For every u ∈ V(G_D), if dist(u, c) ≤ Rmax, we update u.sum ← u.sum + dist(u, c) and u.count ← u.count + 1. There are l knodes in C, and we run Dijkstra's algorithm l times. If u.count = l, for u ∈ V(G_D), it indicates that u can be added into the set of centers, V_c.

Algorithm 4 GetCommunity(G_D, C, Rmax)
Input: the graph G_D, a core C, and the radius Rmax.
Output: a community uniquely determined by the core C.
1: find the set of cnodes, V_c, where each cnode v_c ∈ V_c can reach every c ∈ C within Rmax;
2: let G_s(V_s, E_s) be a virtual graph such that V_s = V ∪ {s} and E_s = E ∪ {(s, v) | v ∈ V_c} where every w_e((s, v)) = 0;
3: run Dijkstra's algorithm to find the shortest paths from s to all nodes in V_s;
4: let G_t(V_t, E_t) be a virtual graph such that V_t = V ∪ {t} and E_t = E ∪ {(u, t) | u ∈ C} where every w_e((u, t)) = 0;
5: let every (u, v) ∈ E_t be (v, u), and run Dijkstra's algorithm to find the shortest paths from t to all nodes in V_t;
6: V ← {u | dist(s, u) + dist(u, t) ≤ Rmax ∧ u ∈ V(G_D)};
7: construct an induced subgraph R in G_D including all nodes in V;
8: return R;

Fig. 7. Finding the community for a core.

Second, based on V_l = C and the computed V_c, we compute V as follows. (1) We construct a virtual graph G_s(V_s, E_s) where V_s = V(G_D) ∪ {s} and E_s = E ∪ {(s, v) | v ∈ V_c}. The weight for each newly added edge (s, v) is set to zero.
Like Neighbor(), it runs Dijkstra's algorithm to compute the shortest paths from $s$ to all the other nodes (lines 2-3). After line 3, every node $u$ is associated with a counter recording the distance $dist(s, u)$, if $dist(s, u) \le$ Rmax. (2) We construct another virtual graph $G_t(V_t, E_t)$ where $V_t = V(G_D) \cup \{t\}$ and $E_t = E \cup \{(v, t) \mid v \in C\}$. The weight for each newly added edge $(v, t)$ is set to zero. We treat the graph $G_t$ as a reversed graph by virtually dealing with $(v, u) \in E_t$ as $(u, v)$. Again, like Neighbor(), we run Dijkstra's algorithm to compute the shortest paths from $t$ to all the others (lines 4-5). Every node $u$ is associated with another counter recording the distance $dist(u, t)$ if $dist(u, t) \le$ Rmax. Then $V$ is computed by selecting the nodes $u \in V(G_D)$ with $dist(s, u) + dist(u, t) \le$ Rmax (line 6). Note that $V_c \subseteq V$ and $C \subseteq V$. The total time complexity for Algorithm 4 is $O(l \cdot (n \cdot \log(n) + m))$.

We explain Algorithm 4 using the database graph $G_D$ in Fig. 4 for the 3-keyword query, $k_1 = a$, $k_2 = b$, and $k_3 = c$, with Rmax = 8. Let $R(V, E)$ be the community for the core C = [v 3, v 8, v ]. Here, $V = V_c \cup V_l \cup V_p$. $V_l$ is the given set

of knodes, $C$, $V_c$ = {v, v }, and $V_p$ = {v 0}. The set of edges, $E$, can be easily identified by scanning $E(G_D)$. Fig. 7 shows the community found, where $s$ and $t$ are two virtual nodes used in GetCommunity().

B. The Time/Space Complexity

Theorem IV.1: Algorithm 1 enumerates communities in polynomial delay time, $O(l \cdot (n \cdot \log(n) + m))$, with the space complexity of $O(l \cdot n + m)$.

Proof Sketch: The time complexity to get a community from a core (line 7) is $O(l \cdot (n \cdot \log(n) + m))$, so we only need to prove that the complexity to get the next core (line 9) is $O(l \cdot (n \cdot \log(n) + m))$. The lines that precompute the neighborsets invoke Neighbor() $l$ times, which costs $O(l \cdot (n \cdot \log(n) + m))$. The loop in Next() runs for at most $l$ iterations. In each iteration, we invoke Neighbor() twice, which costs $O(n \cdot \log(n) + m)$, and BestCore() once. Note that BestCore() is $O(n)$. So the total time complexity for Next() becomes $O(l \cdot (n \cdot \log(n) + m))$, and the total time complexity for Algorithm 1 to enumerate each answer is also $O(l \cdot (n \cdot \log(n) + m))$. For the space complexity, we need to record $l$ values for each center $v$, the best core centered at $v$, using space $O(n \cdot l)$, and load the database graph, $G_D$, using space $O(n + m)$. All $S_i$ and $N_i$ ($1 \le i \le l$) cost space $O(n \cdot l)$. The total space complexity is $O(n \cdot l + m)$.

V. FIND TOP-K COMMUNITIES

The algorithm for COMM-k is shown in Algorithm 5. It takes four inputs, the database graph $G_D$, the set of keywords $\{k_1, k_2, \ldots, k_l\}$, the radius Rmax, and an integer $k > 0$, and outputs the top-$k$ communities. We first compute the set of nodes, $S_i$, that contain the keyword $k_i$, and compute the neighborset of $S_i$, for $1 \le i \le l$ (lines 1-3). Then we compute the first and best core, $C$ (line 4). In order to find the top-$k$ communities, we use a data structure, called can-list, to maintain a list of core candidates, among which the top-$k$ cores and their communities can be identified. The maximum size of the pool is at most $l \cdot k$, which we will explain later in detail. A candidate core is kept in a 4-element tuple, called a can-tuple, in the form of ($C$, $cost$, $pos$, $prev$). Here, $C$ is the core of a community, and $cost$ is the minimum total weight from the nearest center of $C$, denoted $u$, to every node $c_i \in C$, for $1 \le i \le l$, that is, $\sum_{i=1}^{l} dist(u, c_i)$.
The $prev$ points to its previous candidate in the can-list. We explain $pos$ using an example. Consider two can-tuples, $x = (C, cost, \cdot, \cdot)$ and $x' = (C', cost', i, x)$, and suppose $C = [c_1, c_2, \ldots, c_i, \ldots, c_l]$ and $C' = [c'_1, c'_2, \ldots, c'_i, \ldots, c'_l]$. The $prev$ in the can-tuple $x'$ points to the can-tuple $x$. The position, $pos = i$, in $x'$, means that, by comparing $C$ and $C'$ kept in the two candidates, $c_j = c'_j$ if $j < i$, $c_i \ne c'_i$, and $c_j$ and $c'_j$ may or may not be the same if $j > i$.

Algorithm 5 COMM-k($G_D$, $\{k_1, k_2, \ldots, k_l\}$, $k$, Rmax)
Input: the database graph ($G_D$), the set of keywords $\{k_1, k_2, \ldots, k_l\}$, radius Rmax, and $k > 0$.
Output: the top $k$ communities.
1: for $i = 1$ to $l$ do
2:   $S_i \leftarrow$ the set of nodes in $G_D$ containing $k_i$;
3:   $N_i \leftarrow$ Neighbor($G_D$, $S_i$, Rmax);
4: ($C$, $cost$) $\leftarrow$ BestCore($N_1, N_2, \ldots, N_l$);
5: $H \leftarrow \emptyset$;
6: $H$.enheap($C$, $cost$, $1$, $\emptyset$);
7: while $H \ne \emptyset$ do
8:   $g \leftarrow H$.deheap(); {$g = (C, cost, pos, prev)$}
9:   $G \leftarrow$ GetCommunity($g.C$);
10:  output $G$;
11:  $k \leftarrow k - 1$;
12:  if $k = 0$ then
13:    return;
14:  Next($g$);
15: Procedure Next($g$)
16: for $i = 1$ to $l$ do
17:   $N_i \leftarrow$ Neighbor($\{g.C[i]\}$);
18:   $S_i \leftarrow$ the set of nodes in $G_D$ containing $k_i$;
19: $h \leftarrow g$;
20: while $h \ne \emptyset$ do
21:   $i \leftarrow h.pos$;
22:   $S_i \leftarrow S_i - \{h.C[i]\}$;
23:   $h \leftarrow h.prev$;
24: for $i = l$ downto $g.pos$ do
25:   $S_i \leftarrow S_i - \{g.C[i]\}$;
26:   $N_i \leftarrow$ Neighbor($S_i$);
27:   ($C'$, $cost'$) $\leftarrow$ BestCore($N_1, N_2, \ldots, N_l$);
28:   if $C' \ne \emptyset$ then
29:     $H$.enheap($C'$, $cost'$, $i$, $g$);
30:   $S_i \leftarrow S_i \cup \{g.C[i]\}$;
31:   $N_i \leftarrow$ Neighbor($S_i$);

Over the can-list, we use a Fibonacci heap, denoted $H$, which is initialized to be empty (line 5). In the following, when we enheap a can-tuple to $H$, we mean to insert it into the can-list, and then keep a pointer in $H$ pointing to the can-tuple on the can-list. When we deheap a can-tuple from $H$, we simply remove the pointer from $H$, but still maintain the can-tuple on the can-list. The complexity for $H$.enheap() and $H$.deheap() is $O(1)$ and $O(\log n)$, respectively. The algorithm first enheaps a can-tuple for the first found core $C$ with its cost (line 6). Because it is the first can-tuple to be maintained in the can-list, its $prev = \emptyset$ and its $pos = 1$. The following while loop repeats while $H$ is non-empty (lines 7-14). When $H \ne \emptyset$, it deheaps $H$ and assigns the result to a can-tuple, $g$ (line 8). Because $H$ is maintained in ascending order, the can-tuple $g$ has the smallest cost among those in $H$.
It calls GetCommunity() to output the community for the core $g.C$ (line 9). Then, it decreases $k$ by 1, and checks whether it has already output $k$ communities (lines 11-13). If $k > 0$, it adds more can-tuples into $H$ by calling the procedure Next() (line 14). The procedure Next() does three main things. First, in a preparation phase, it computes the neighborset for every node in the core of the deheaped can-tuple $g$, $g.C[i]$, for $1 \le i \le l$, and it also recomputes $S_i$ so as to include all nodes in $G_D$ that contain the keyword $k_i$. Second, it removes those candidates that have been considered before (lines 20-23). This process limits the search space to a specific subspace out of $l$ subspaces. Third, it adapts the idea used in Algorithm 1 to find the next $l$ candidates, and enheaps them into $H$.
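The interplay between the can-list (pool) and the heap can be sketched as follows. This is only an illustration under stated assumptions: Python's `heapq` binary heap stands in for the Fibonacci heap named in the text, and the class names `CanTuple` and `CandidatePool` and the method `exclusions` are hypothetical. The point of the sketch is that a deheaped can-tuple stays in the pool, so that later can-tuples can still follow their `prev` pointers (as in lines 19-23 of Algorithm 5).

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CanTuple:
    core: Tuple[str, ...]
    cost: float
    pos: int              # 1-based position, as in the text
    prev: Optional[int]   # index of the previous can-tuple in the pool


class CandidatePool:
    def __init__(self) -> None:
        self.can_list: List[CanTuple] = []        # never shrinks
        self.heap: List[Tuple[float, int]] = []   # (cost, index into can_list)

    def enheap(self, core, cost, pos, prev) -> int:
        """Append to the can-list and keep only a pointer in the heap."""
        self.can_list.append(CanTuple(tuple(core), cost, pos, prev))
        idx = len(self.can_list) - 1
        heapq.heappush(self.heap, (cost, idx))
        return idx

    def deheap(self) -> int:
        """Remove the cheapest pointer; the can-tuple itself stays pooled."""
        _, idx = heapq.heappop(self.heap)
        return idx

    def exclusions(self, idx: int) -> List[Tuple[int, str]]:
        """Follow prev pointers and collect (pos, knode) pairs to exclude."""
        out = []
        h: Optional[int] = idx
        while h is not None:
            t = self.can_list[h]
            out.append((t.pos, t.core[t.pos - 1]))
            h = t.prev
        return out


pool = CandidatePool()
root = pool.enheap(("a1", "b1", "c1"), 7.0, 1, None)
g = pool.deheap()
x = pool.enheap(("a1", "b2", "c1"), 9.0, 2, g)
y = pool.enheap(("a1", "b1", "c2"), 8.0, 3, g)
cheapest = pool.deheap()
```

Storing indices rather than the tuples themselves in the heap mirrors the "pointer in $H$" wording, and keeps each heap entry small regardless of $l$.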

Fig. 8. Finding top-k communities. Panels: (a) initial, (b) 1st iteration, (c) 2nd iteration, (d) 3rd iteration, (e) 4th iteration, (f) 5th iteration. Each panel shows the heap and the pool of can-tuples with columns Core, Cost, Pos, Prev.

Fig. 8 shows the pool (can-list) and the heap $H$ when finding the top-5 communities in the example introduced earlier. The 5 communities, $R_i$, for $1 \le i \le 5$, are listed in Fig. 5.

Theorem V.1: The time complexity for Algorithm 5 is $O(l \cdot (n \cdot \log(n) + m))$ using $O(l^2 \cdot k + l \cdot n + m)$ space.

Proof Sketch: For the time complexity, we only need to prove that the heap operations (line 8 and line 29) do not affect the complexity of the algorithm and that lines 20-23 are done in $O(l \cdot n)$. The complexity of the other parts is the same as in Algorithm 1. In every iteration of Algorithm 5, we deheap a can-tuple from $H$ and enheap at most $l$ can-tuples into $H$ in order to get the next highest-ranked community. Suppose we have already output $p$ communities. There are at most $p \cdot l$ can-tuples in $H$. Note that in total $p \le n^l$. Using the Fibonacci heap, deheap() costs $O(\log(p \cdot l)) \le O(\log(n^l \cdot l)) = O(l \cdot \log(n) + \log(l)) \le O(l \cdot n)$ time. The enheap() only costs $O(1)$ time. Therefore, the heap operations do not affect the time complexity of the algorithm. Lines 20-23 remove some already used nodes from $S_i$, whose time cost is at most $\sum_{i=1}^{l} |S_i| = O(n \cdot l)$. It does not affect the time complexity of Algorithm 5. The time complexity to get each community is $O(l \cdot (n \cdot \log(n) + m))$.
For the space complexity, we need to maintain up to $p \cdot l$ can-tuples in the pool, where $p$ is the number of communities output so far. The space complexity of the other parts is the same as in Algorithm 1. For each generated community, we have to record its core using space $O(l)$, so the total space to record all the generated cores is $O(l^2 \cdot p)$, and the total space complexity for Algorithm 5 to generate the $k$-th best answer is $O(l^2 \cdot k + l \cdot n + m)$.

VI. INDEXING AND GRAPH PROJECTION

As stated before, the time complexity for finding all/top-k communities for a user-given $l$-keyword query against a database graph, $G_D$, with Rmax, is polynomial delay of $O(l \cdot (n \cdot \log(n) + m))$. However, when $G_D$ is large in size, it is still costly to process COMM-all/COMM-k queries. In order to reduce the search space, in this section, we introduce an index that can be used to project a small graph $G_P \subseteq G_D$ to support $l$-keyword queries with a radius up to $R$, which is the largest Rmax users can use. In brief, the result for a given $l$-keyword query against $G_D$ is the same as the result for the same query against the projected database graph $G_P$. For an $l$-keyword query, the projected graph $G_P$ can be much smaller than $G_D$.

Algorithm 6 GraphProjection($\{k_1, k_2, \ldots, k_l\}$, Rmax)
Input: the set of keywords $\{k_1, k_2, \ldots, k_l\}$, radius Rmax.
Output: a projected graph $G_P$ ($\subseteq G_D$).
1: $V' \leftarrow \emptyset$; $E' \leftarrow \emptyset$; $V_c \leftarrow \emptyset$; $W \leftarrow \emptyset$;
2: for $i = 1$ to $l$ do
3:   $W_i \leftarrow$ getNode(invertedN, $k_i$);
4:   $E_i \leftarrow$ getEdge(invertedE, $k_i$);
5:   $V_i \leftarrow W_i \cup \{u \mid (u, v) \in E_i \vee (v, u) \in E_i\}$;
6:   $W \leftarrow W \cup W_i$;
7:   $E' \leftarrow E' \cup E_i$;
8:   $V' \leftarrow V' \cup V_i$;
9:   $V_c \leftarrow V_i$ if $i = 1$, otherwise $V_c \leftarrow V_c \cap V_i$;
10: let $G_s(V', E')$ be a virtual graph with a new virtual node $s$, and a set of new edges $(s, v)$, for $v \in V_c$, where $w_e((s, v)) = 0$;
11: compute the shortest paths from $s$ to others over $G_s$;
12: let $G_t(V', E')$ be a virtual graph with a new virtual node $t$, and a set of new edges $(v, t)$, for $v \in W$, where $w_e((v, t)) = 0$;
13: compute the shortest paths from $t$ to others, on $G_t$, by virtually considering $(u, v) \in E'$ as $(v, u)$ (reverse the order);
14: $V_P \leftarrow \{v \mid v \in V' \wedge dist(s, v) + dist(v, t) \le \text{Rmax}\}$;
15: $E_P \leftarrow \{(u, v) \mid u \in V_P \wedge v \in V_P \wedge (u, v) \in E'\}$;
16: return $G_P(V_P, E_P)$;
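The filtering step shared by Algorithm 4 (lines 2-6) and Algorithm 6 (lines 10-14) can be sketched as below. This is a hedged illustration rather than the paper's code: the adjacency-dictionary graph format and the function names are assumptions, edge weights are assumed non-negative, and the virtual nodes $s$ and $t$ are modeled implicitly by starting Dijkstra's algorithm from several sources at distance zero, which is equivalent to adding zero-weight edges from a virtual source.

```python
import heapq
import math
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # u -> [(v, weight of edge u->v)]


def multi_source_dijkstra(graph: Graph, sources: Iterable[str],
                          rmax: float) -> Dict[str, float]:
    """Shortest distances from a virtual node linked to `sources` by
    zero-weight edges; nodes farther than rmax are not recorded."""
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd <= rmax and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def reverse(graph: Graph) -> Graph:
    """Treat every edge (u, v) as (v, u)."""
    rev: Graph = {u: [] for u in graph}
    for u, edges in graph.items():
        for v, w in edges:
            rev.setdefault(v, []).append((u, w))
    return rev


def nodes_within_radius(graph: Graph, centers: Iterable[str],
                        knodes: Iterable[str], rmax: float) -> Set[str]:
    """Keep v when dist(s, v) + dist(v, t) <= rmax."""
    from_s = multi_source_dijkstra(graph, centers, rmax)
    to_t = multi_source_dijkstra(reverse(graph), knodes, rmax)
    return {v for v, d in from_s.items()
            if v in to_t and d + to_t[v] <= rmax}


g = {
    "c": [("p", 1.0), ("x", 5.0)],
    "p": [("k", 2.0)],
    "x": [("k", 5.0)],
    "k": [],
}
tight = nodes_within_radius(g, centers={"c"}, knodes={"k"}, rmax=4.0)
loose = nodes_within_radius(g, centers={"c"}, knodes={"k"}, rmax=10.0)
```

In the toy graph, node `x` lies on a center-to-keyword path of total weight 10, so it is kept only when the radius is at least 10.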
We use two inverted indexes, invertedN and invertedE. For each keyword $w$ in the database graph $G_D$, invertedN maintains an inverted list that stores the set of nodes, denoted $V_w$, where every node $v \in V_w$ contains the keyword $w$, and invertedE maintains the set of edges, $(u, v)$, such that both $u$ and $v$ are within $R$ from at least one node in $V_w$. The node/edge weights are kept with the nodes and the edges in the two inverted indexes. Next, we show how to use the two inverted indexes to project a database graph for an $l$-keyword query. Note that with the two inverted indexes, we do not need to use the underlying graph $G_D$; in other words, the entire $G_D$ can be constructed using the two inverted indexes.

The algorithm to project a subgraph of $G_D$ for an $l$-keyword query within Rmax is shown in Algorithm 6. The main idea is the same as finding a community for a given core, as illustrated in Fig. 7 for a given set of cnodes (v and v ) and a given set of knodes (v 3, v 8, and v ). When projecting a graph $G_P$, in Algorithm 6, the set of cnodes becomes $V_c$ and the set of knodes becomes $W$. As shown in Algorithm 6, in a for loop (lines 2-9), for every keyword $k_i$, it obtains the set of nodes that contain $k_i$, $W_i$, using getNode(invertedN, $k_i$) (line 3); and it obtains the set of edges, $E_i$, using getEdge(invertedE, $k_i$),

where both ends of an edge are reachable from a node in $W_i$ (line 4). Note that $V_i$ is the neighborset of $W_i$ (line 5). After the for loop, it can project a subgraph $G'(V', E')$ of $G_D$ to answer the given $l$-keyword query if Rmax $\le R$. But it is still considered large. Note that in the for loop, we also compute the set of centers, $V_c$ ($\subseteq V'$), where every node $v \in V_c$ can reach at least one node $v_i$ ($\in W_i$) which contains the keyword $k_i$. Based on the set of centers $V_c$, a smaller graph $G_P$ is constructed using ideas similar to those given in Algorithm 4. We omit the discussion due to the limit of space.

VII. PERFORMANCE STUDIES

We implemented two polynomial delay algorithms, Algorithm 1 and Algorithm 5, to find all/top-k communities. We denote them as PDall and PDk below. We also implemented four expanding-based algorithms. Two are based on bottom-up expanding; we denote them as BUall and BUk, for finding all/top-k communities, respectively. Similarly, two are based on top-down expanding; we denote them as TDall and TDk. All the algorithms were written in C++.

TABLE II: PARAMETERS FOR DBLP DATASET
Parameter | Range | Default
KWF | .0003, .0006, .0009, .0012, |
l | 2, 3, 4, 5, 6 | 4
Rmax | 4, 5, 6, 7, 8 | 6
k | 50, 100, 150, 200, |

TABLE III: KWF AND THE KEYWORDS USED IN DBLP
KWF | Keywords
.0003 | scalable, protocols, distance, discovery
.0006 | space, graph, routing, scheme
.0009 | environment, database, support, development, optimization, fuzzy
.0012 | dynamic, application, modeling, logic
.0015 | web, parallel, control, algorithms

TABLE IV: PARAMETERS FOR IMDB DATASET
Parameter | Range | Default
KWF | .0003, .0006, .0009, .0012, |
l | 2, 3, 4, 5, 6 | 4
Rmax | 9, 10, 11, 12, 13 |
k | 50, 100, 150, 200, |

TABLE V: KWF AND THE KEYWORDS USED IN IMDB
KWF | Keywords
.0003 | summer, bride, game, dream
.0006 | Friday, heaven, street, party
.0009 | star, death, ll, girl, lost, blood
.0012 | city, American, blue, world
.0015 | night, story, king, house

We tested the algorithms using two real datasets, DBLP (DBLP 2008) and IMDB. Both are used in the reported studies to test $l$-keyword queries. We use the same edge-weight function $w_e()$ as used in [], [7], [3], $w_e((u, v)) = \log_2(1 + N_{in}(v))$, where $N_{in}(v)$ is the in-degree of node $v$.
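As a small illustration of the edge-weight function, the sketch below computes weights from an edge list, assuming the function reads $w_e((u, v)) = \log_2(1 + N_{in}(v))$. The function name `edge_weights` and the edge-list input format are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def edge_weights(edges: List[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Weight each directed edge (u, v) by log2(1 + in-degree of v)."""
    indeg = Counter(v for _, v in edges)
    return {(u, v): math.log2(1 + indeg[v]) for u, v in edges}


w = edge_weights([("a", "c"), ("b", "c"), ("c", "d")])
```

Under this weighting, an edge into a node with many incoming references is more expensive to traverse, so paths through heavily referenced tuples count as longer.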
For DBLP, there are 4 tables, Author(Aid, Name), Paper(Pid, Title, Other), Write(Aid, Pid, Remark), Cite(Pid, Pid). The numbers of tuples of the 4 tables are, 597K, 986K, 46K, and K, respectively. The whole DBLP dataset consists of 4,, 0 tuples and 5, 076, 86 references. The database graph consists of 4,, 0 nodes and 0, 53, 65 directed edges (bi-directed). The parameters with their default values are shown in Table II. The $l$ keywords are selected from the keyword sets shown in Table III, with the associated KWF (keyword frequency).

For IMDB, there are 3 tables: Users(UserID, Gender, Age, Occupation, Zip-code), Movies(MovieID, Title, Genres), and Ratings(UserID, MovieID, Rating, Timestamp). The numbers of tuples of the 3 tables are, 6.04K, 3.88K and, 000.K, respectively. The database graph consists of, 00, 3 nodes and 4, 000, 836 directed edges, which is denser than the DBLP graph. The parameters used and their default values are shown in Table IV. The keyword sets we used are shown in Table V, with their associated KWF.

The characteristics of the two datasets are different. In the DBLP dataset, each author writes 4.06 papers on average while each paper is written by.46 authors on average. In the IMDB dataset, each user evaluates movies on average while each movie is evaluated by users on average. This fact explains why we set the default Rmax to be 6 for DBLP and for IMDB.

All experiments were conducted on a.60GHz Intel(R) Xeon(R) CPU and GB memory PC running Windows Server 2003. For all algorithms to be tested, we first project a database subgraph, for an $l$-keyword query, using the two inverted indexes (invertedN and invertedE), and then test the algorithms. The maximum and average sizes of the projected graphs are.% and 0.4% of the DBLP graph, and.8% and 0.5% of the IMDB graph, respectively. We significantly reduce the search space using the two inverted indexes. The elapsed times for constructing the inverted indexes for DBLP and IMDB are 355 seconds and 4 seconds, respectively. The total sizes of the inverted indexes for DBLP and IMDB are,6 MB and 84 MB respectively, compared with the sizes of the raw datasets, 445 MB and 4 MB.
For testing algorithms PDall, BUall, and TDall, we report the average-delay, which is the total CPU time divided by the number of communities found, as used in [] for the purpose of testing polynomial delay algorithms. For testing PDk, BUk, and TDk, we report the total CPU time for finding all $k$ communities. We also report the maximum memory used in testing.

Exp-1 (IMDB): We compare PDall with BUall and TDall for finding all communities against IMDB. Fig. 9(a) and 9(b) show that the more frequent the keyword is, the longer the average-delay is, and the more memory is consumed. PDall is 0 times faster than the BUall and TDall algorithms, and consumes the least memory. […] consumes more memory than […] does,

because each center node is associated with keyword node sets, which are the sets of keyword nodes that can be reached from the center. […] needs to maintain all these sets whereas […] can free the memory after it outputs the communities found. When increasing $l$ from 2 to 6, as shown in Fig. 9(c) and 9(d), the average-delay for all algorithms decreases, as expected. PDall is also faster than both BUall and TDall. The memory cost increases using BUall and TDall, because, when $l$ increases, the number of resulting communities increases, and both BUall and TDall need to maintain all the resulting communities. PDall consumes the least memory and does not vary much, even when $l$ increases. Fig. 9(e) and 9(f) show that, when Rmax increases, both the average-delay and the memory consumption increase for all three algorithms. PDall performs best among all.

Fig. 9. Find-All (IMDB). Panels: (a) Vary KWF, (c) Vary l, (e) Vary Rmax, with CPU (ms) on the vertical axis; (b) Vary KWF, (d) Vary l, (f) Vary Rmax, with Memory (Byte) on the vertical axis.
Fig. 10. Find-TopK (IMDB). Panels: (a) Vary KWF, (b) Vary l, (c) Vary Rmax, (d) Vary k, with CPU (sec) on the vertical axis; curves for TDk, BUk, and PDk.
Fig. 11. Find-All (DBLP). Panels: (a) Vary KWF, (c) Vary l, (e) Vary Rmax, with CPU (ms) on the vertical axis; (b) Vary KWF, (d) Vary l, (f) Vary Rmax, with Memory (Byte) on the vertical axis.
Fig. 12. Interactive TopK Test. Panels: (a) Vary k (IMDB), (b) Vary k (DBLP), with CPU (sec) on the vertical axis; curves for TDk, BUk, and PDk.

We compare PDk with BUk and TDk for finding top-k communities against IMDB. As shown in Fig. 10(a), when KWF increases, the total time to get the top-k communities increases in most cases for all three algorithms, and PDk performs best. Fig. 10(b) shows that, when $l$ increases, the time for BUk and TDk increases, because the number of temporary results generated increases. PDk is consistent. In Fig. 10(c) and 10(d), when Rmax or $k$ increases, the total time to get the top-k communities increases for all three algorithms. PDk performs best. The memory consumption for all tests is not large and does not change much. Due to the space limit, we do not show the memory consumption.
As an indicator, the memory consumption for the default values of the three algorithms is KB (TDk),. KB (BUk), and 9.6 KB (PDk).

Exp-2 (DBLP): We compare PDall with BUall and TDall for finding all communities against DBLP. In Fig. 11(a), (b), (e) and (f), for the memory consumption, PDall performs best, but for the average-delay, PDall is slower than both BUall and TDall, because, in the DBLP dataset, the probability for a set of keyword nodes to be centered at multiple nodes is very small, and most of the results have only one center. In this situation, the number of duplications generated by BUall and TDall is very small, which makes them faster at enumerating all communities. When KWF or Rmax increases, the average-delay and memory consumption for all three algorithms increase. Fig. 11(c) shows that, when the number of keywords $l$ increases, the average-delay for all three algorithms decreases. […] decreases faster. In Fig. 11(d), when $l$ increases, the memory consumption for BUall and TDall increases, because they have to maintain all the results generated, and the number of results increases when $l$ increases. PDall consumes less memory when $l$ becomes larger, because, when $l$ increases, the size of the projected graph decreases for the DBLP dataset.

We compare PDk with BUk and TDk for finding top-k communities against DBLP. They show similar trends as they do for finding all communities, for the same reason that the number of duplications is small in DBLP.


More information

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives Block #6: Properties of Integrls, Indefinite Integrls Gols: Definition of the Definite Integrl Integrl Clcultions using Antiderivtives Properties of Integrls The Indefinite Integrl 1 Riemnn Sums - 1 Riemnn

More information

7.2 The Definite Integral

7.2 The Definite Integral 7.2 The Definite Integrl the definite integrl In the previous section, it ws found tht if function f is continuous nd nonnegtive, then the re under the grph of f on [, b] is given by F (b) F (), where

More information

NON-DETERMINISTIC FSA

NON-DETERMINISTIC FSA Tw o types of non-determinism: NON-DETERMINISTIC FS () Multiple strt-sttes; strt-sttes S Q. The lnguge L(M) ={x:x tkes M from some strt-stte to some finl-stte nd ll of x is proessed}. The string x = is

More information

Table of Content. c 1 / 5

Table of Content. c 1 / 5 Tehnil Informtion - t nd t Temperture for Controlger 03-2018 en Tble of Content Introdution....................................................................... 2 Definitions for t nd t..............................................................

More information

Finite State Automata and Determinisation

Finite State Automata and Determinisation Finite Stte Automt nd Deterministion Tim Dworn Jnury, 2016 Lnguges fs nf re df Deterministion 2 Outline 1 Lnguges 2 Finite Stte Automt (fs) 3 Non-deterministi Finite Stte Automt (nf) 4 Regulr Expressions

More information

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007 A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H Thoms Shores Deprtment of Mthemtics University of Nebrsk Spring 2007 Contents Rtes of Chnge nd Derivtives 1 Dierentils 4 Are nd Integrls 5 Multivrite Clculus

More information

Common intervals of genomes. Mathieu Raffinot CNRS LIAFA

Common intervals of genomes. Mathieu Raffinot CNRS LIAFA Common intervls of genomes Mthieu Rffinot CNRS LIF Context: omprtive genomis. set of genomes prtilly/totlly nnotte Informtive group of genes or omins? Ex: COG tse Mny iffiulties! iology Wht re two similr

More information

MA10207B: ANALYSIS SECOND SEMESTER OUTLINE NOTES

MA10207B: ANALYSIS SECOND SEMESTER OUTLINE NOTES MA10207B: ANALYSIS SECOND SEMESTER OUTLINE NOTES CHARLIE COLLIER UNIVERSITY OF BATH These notes hve been typeset by Chrlie Collier nd re bsed on the leture notes by Adrin Hill nd Thoms Cottrell. These

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design nd Anlysis LECTURE 8 Mx. lteness ont d Optiml Ching Adm Smith 9/12/2008 A. Smith; sed on slides y E. Demine, C. Leiserson, S. Rskhodnikov, K. Wyne Sheduling to Minimizing Lteness Minimizing

More information

( ) 1. 1) Let f( x ) = 10 5x. Find and simplify f( 2) and then state the domain of f(x).

( ) 1. 1) Let f( x ) = 10 5x. Find and simplify f( 2) and then state the domain of f(x). Mth 15 Fettermn/DeSmet Gustfson/Finl Em Review 1) Let f( ) = 10 5. Find nd simplif f( ) nd then stte the domin of f(). ) Let f( ) = +. Find nd simplif f(1) nd then stte the domin of f(). ) Let f( ) = 8.

More information

The Double Integral. The Riemann sum of a function f (x; y) over this partition of [a; b] [c; d] is. f (r j ; t k ) x j y k

The Double Integral. The Riemann sum of a function f (x; y) over this partition of [a; b] [c; d] is. f (r j ; t k ) x j y k The Double Integrl De nition of the Integrl Iterted integrls re used primrily s tool for omputing double integrls, where double integrl is n integrl of f (; y) over region : In this setion, we de ne double

More information

Generalization of 2-Corner Frequency Source Models Used in SMSIM

Generalization of 2-Corner Frequency Source Models Used in SMSIM Generliztion o 2-Corner Frequeny Soure Models Used in SMSIM Dvid M. Boore 26 Mrh 213, orreted Figure 1 nd 2 legends on 5 April 213, dditionl smll orretions on 29 My 213 Mny o the soure spetr models ville

More information

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution Tehnishe Universität Münhen Winter term 29/ I7 Prof. J. Esprz / J. Křetínský / M. Luttenerger. Ferur 2 Solution Automt nd Forml Lnguges Homework 2 Due 5..29. Exerise 2. Let A e the following finite utomton:

More information

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by.

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by. NUMERICAL INTEGRATION 1 Introduction The inverse process to differentition in clculus is integrtion. Mthemticlly, integrtion is represented by f(x) dx which stnds for the integrl of the function f(x) with

More information

RELATIONAL MODEL.

RELATIONAL MODEL. RELATIONAL MODEL Structure of Reltionl Dtbses Reltionl Algebr Tuple Reltionl Clculus Domin Reltionl Clculus Extended Reltionl-Algebr- Opertions Modifiction of the Dtbse Views EXAMPLE OF A RELATION BASIC

More information

PYTHAGORAS THEOREM WHAT S IN CHAPTER 1? IN THIS CHAPTER YOU WILL:

PYTHAGORAS THEOREM WHAT S IN CHAPTER 1? IN THIS CHAPTER YOU WILL: PYTHAGORAS THEOREM 1 WHAT S IN CHAPTER 1? 1 01 Squres, squre roots nd surds 1 02 Pythgors theorem 1 03 Finding the hypotenuse 1 04 Finding shorter side 1 05 Mixed prolems 1 06 Testing for right-ngled tringles

More information

Matrices SCHOOL OF ENGINEERING & BUILT ENVIRONMENT. Mathematics (c) 1. Definition of a Matrix

Matrices SCHOOL OF ENGINEERING & BUILT ENVIRONMENT. Mathematics (c) 1. Definition of a Matrix tries Definition of tri mtri is regulr rry of numers enlosed inside rkets SCHOOL OF ENGINEERING & UIL ENVIRONEN Emple he following re ll mtries: ), ) 9, themtis ), d) tries Definition of tri Size of tri

More information

Engr354: Digital Logic Circuits

Engr354: Digital Logic Circuits Engr354: Digitl Logi Ciruits Chpter 4: Logi Optimiztion Curtis Nelson Logi Optimiztion In hpter 4 you will lern out: Synthesis of logi funtions; Anlysis of logi iruits; Tehniques for deriving minimum-ost

More information

Math Lecture 23

Math Lecture 23 Mth 8 - Lecture 3 Dyln Zwick Fll 3 In our lst lecture we delt with solutions to the system: x = Ax where A is n n n mtrix with n distinct eigenvlues. As promised, tody we will del with the question of

More information

Intermediate Math Circles Wednesday 17 October 2012 Geometry II: Side Lengths

Intermediate Math Circles Wednesday 17 October 2012 Geometry II: Side Lengths Intermedite Mth Cirles Wednesdy 17 Otoer 01 Geometry II: Side Lengths Lst week we disussed vrious ngle properties. As we progressed through the evening, we proved mny results. This week, we will look t

More information

Fast Frequent Free Tree Mining in Graph Databases

Fast Frequent Free Tree Mining in Graph Databases The Chinese University of Hong Kong Fst Frequent Free Tree Mining in Grph Dtses Peixing Zho Jeffrey Xu Yu The Chinese University of Hong Kong Decemer 18 th, 2006 ICDM Workshop MCD06 Synopsis Introduction

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design nd Anlysis LECTURE 5 Supplement Greedy Algorithms Cont d Minimizing lteness Ching (NOT overed in leture) Adm Smith 9/8/10 A. Smith; sed on slides y E. Demine, C. Leiserson, S. Rskhodnikov,

More information

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties; Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

ANALYSIS AND MODELLING OF RAINFALL EVENTS

ANALYSIS AND MODELLING OF RAINFALL EVENTS Proeedings of the 14 th Interntionl Conferene on Environmentl Siene nd Tehnology Athens, Greee, 3-5 Septemer 215 ANALYSIS AND MODELLING OF RAINFALL EVENTS IOANNIDIS K., KARAGRIGORIOU A. nd LEKKAS D.F.

More information

Hyers-Ulam stability of Pielou logistic difference equation

Hyers-Ulam stability of Pielou logistic difference equation vilble online t wwwisr-publitionsom/jns J Nonliner Si ppl, 0 (207, 35 322 Reserh rtile Journl Homepge: wwwtjnsom - wwwisr-publitionsom/jns Hyers-Ulm stbility of Pielou logisti differene eqution Soon-Mo

More information

1 Online Learning and Regret Minimization

1 Online Learning and Regret Minimization 2.997 Decision-Mking in Lrge-Scle Systems My 10 MIT, Spring 2004 Hndout #29 Lecture Note 24 1 Online Lerning nd Regret Minimiztion In this lecture, we consider the problem of sequentil decision mking in

More information

Neighborhood Based Fast Graph Search in Large Networks

Neighborhood Based Fast Graph Search in Large Networks Neighborhood Bsed Fst Grph Serh in Lrge Networks Arijit Khn Dept. of Computer Siene University of Cliforni Snt Brbr, CA 9306 rijitkhn@s.usb.edu Ziyu Gun Dept. of Computer Siene University of Cliforni Snt

More information

CS 573 Automata Theory and Formal Languages

CS 573 Automata Theory and Formal Languages Non-determinism Automt Theory nd Forml Lnguges Professor Leslie Lnder Leture # 3 Septemer 6, 2 To hieve our gol, we need the onept of Non-deterministi Finite Automton with -moves (NFA) An NFA is tuple

More information

MATH Final Review

MATH Final Review MATH 1591 - Finl Review November 20, 2005 1 Evlution of Limits 1. the ε δ definition of limit. 2. properties of limits. 3. how to use the diret substitution to find limit. 4. how to use the dividing out

More information

19 Optimal behavior: Game theory

19 Optimal behavior: Game theory Intro. to Artificil Intelligence: Dle Schuurmns, Relu Ptrscu 1 19 Optiml behvior: Gme theory Adversril stte dynmics hve to ccount for worst cse Compute policy π : S A tht mximizes minimum rewrd Let S (,

More information

Section 4.4. Green s Theorem

Section 4.4. Green s Theorem The Clulus of Funtions of Severl Vriles Setion 4.4 Green s Theorem Green s theorem is n exmple from fmily of theorems whih onnet line integrls (nd their higher-dimensionl nlogues) with the definite integrls

More information

THE INFLUENCE OF MODEL RESOLUTION ON AN EXPRESSION OF THE ATMOSPHERIC BOUNDARY LAYER IN A SINGLE-COLUMN MODEL

THE INFLUENCE OF MODEL RESOLUTION ON AN EXPRESSION OF THE ATMOSPHERIC BOUNDARY LAYER IN A SINGLE-COLUMN MODEL THE INFLUENCE OF MODEL RESOLUTION ON AN EXPRESSION OF THE ATMOSPHERIC BOUNDARY LAYER IN A SINGLE-COLUMN MODEL P3.1 Kot Iwmur*, Hiroto Kitgw Jpn Meteorologil Ageny 1. INTRODUCTION Jpn Meteorologil Ageny

More information

Solutions for HW9. Bipartite: put the red vertices in V 1 and the black in V 2. Not bipartite!

Solutions for HW9. Bipartite: put the red vertices in V 1 and the black in V 2. Not bipartite! Solutions for HW9 Exerise 28. () Drw C 6, W 6 K 6, n K 5,3. C 6 : W 6 : K 6 : K 5,3 : () Whih of the following re iprtite? Justify your nswer. Biprtite: put the re verties in V 1 n the lk in V 2. Biprtite:

More information

Part I: Study the theorem statement.

Part I: Study the theorem statement. Nme 1 Nme 2 Nme 3 A STUDY OF PYTHAGORAS THEOREM Instrutions: Together in groups of 2 or 3, fill out the following worksheet. You my lift nswers from the reding, or nswer on your own. Turn in one pket for

More information

20 MATHEMATICS POLYNOMIALS

20 MATHEMATICS POLYNOMIALS 0 MATHEMATICS POLYNOMIALS.1 Introduction In Clss IX, you hve studied polynomils in one vrible nd their degrees. Recll tht if p(x) is polynomil in x, the highest power of x in p(x) is clled the degree of

More information

Logic Synthesis and Verification

Logic Synthesis and Verification Logi Synthesis nd Verifition SOPs nd Inompletely Speified Funtions Jie-Hong Rolnd Jing 江介宏 Deprtment of Eletril Engineering Ntionl Tiwn University Fll 22 Reding: Logi Synthesis in Nutshell Setion 2 most

More information

Lecture 6. CMOS Static & Dynamic Logic Gates. Static CMOS Circuit. PMOS Transistors in Series/Parallel Connection

Lecture 6. CMOS Static & Dynamic Logic Gates. Static CMOS Circuit. PMOS Transistors in Series/Parallel Connection NMOS Trnsistors in Series/Prllel onnetion Leture 6 MOS Stti & ynmi Logi Gtes Trnsistors n e thought s swith ontrolled y its gte signl NMOS swith loses when swith ontrol input is high Peter heung eprtment

More information

] dx (3) = [15x] 2 0

] dx (3) = [15x] 2 0 Leture 6. Double Integrls nd Volume on etngle Welome to Cl IV!!!! These notes re designed to be redble nd desribe the w I will eplin the mteril in lss. Hopefull the re thorough, but it s good ide to hve

More information

Math Calculus with Analytic Geometry II

Math Calculus with Analytic Geometry II orem of definite Mth 5.0 with Anlytic Geometry II Jnury 4, 0 orem of definite If < b then b f (x) dx = ( under f bove x-xis) ( bove f under x-xis) Exmple 8 0 3 9 x dx = π 3 4 = 9π 4 orem of definite Problem

More information

Laboratory for Foundations of Computer Science. An Unfolding Approach. University of Edinburgh. Model Checking. Javier Esparza

Laboratory for Foundations of Computer Science. An Unfolding Approach. University of Edinburgh. Model Checking. Javier Esparza An Unfoling Approh to Moel Cheking Jvier Esprz Lbortory for Fountions of Computer Siene University of Einburgh Conurrent progrms Progrm: tuple P T 1 T n of finite lbelle trnsition systems T i A i S i i

More information

A Mathematical Model for Unemployment-Taking an Action without Delay

A Mathematical Model for Unemployment-Taking an Action without Delay Advnes in Dynmil Systems nd Applitions. ISSN 973-53 Volume Number (7) pp. -8 Reserh Indi Publitions http://www.ripublition.om A Mthemtil Model for Unemployment-Tking n Ation without Dely Gulbnu Pthn Diretorte

More information

University of Sioux Falls. MAT204/205 Calculus I/II

University of Sioux Falls. MAT204/205 Calculus I/II University of Sioux Flls MAT204/205 Clulus I/II Conepts ddressed: Clulus Textook: Thoms Clulus, 11 th ed., Weir, Hss, Giordno 1. Use stndrd differentition nd integrtion tehniques. Differentition tehniques

More information

where the box contains a finite number of gates from the given collection. Examples of gates that are commonly used are the following: a b

where the box contains a finite number of gates from the given collection. Examples of gates that are commonly used are the following: a b CS 294-2 9/11/04 Quntum Ciruit Model, Solovy-Kitev Theorem, BQP Fll 2004 Leture 4 1 Quntum Ciruit Model 1.1 Clssil Ciruits - Universl Gte Sets A lssil iruit implements multi-output oolen funtion f : {0,1}

More information

Bisimulation, Games & Hennessy Milner logic

Bisimulation, Games & Hennessy Milner logic Bisimultion, Gmes & Hennessy Milner logi Leture 1 of Modelli Mtemtii dei Proessi Conorrenti Pweł Soboiński Univeristy of Southmpton, UK Bisimultion, Gmes & Hennessy Milner logi p.1/32 Clssil lnguge theory

More information

Learning Partially Observable Markov Models from First Passage Times

Learning Partially Observable Markov Models from First Passage Times Lerning Prtilly Oservle Mrkov s from First Pssge s Jérôme Cllut nd Pierre Dupont Europen Conferene on Mhine Lerning (ECML) 8 Septemer 7 Outline. FPT in models nd sequenes. Prtilly Oservle Mrkov s (POMMs).

More information

System Validation (IN4387) November 2, 2012, 14:00-17:00

System Validation (IN4387) November 2, 2012, 14:00-17:00 System Vlidtion (IN4387) Novemer 2, 2012, 14:00-17:00 Importnt Notes. The exmintion omprises 5 question in 4 pges. Give omplete explntion nd do not onfine yourself to giving the finl nswer. Good luk! Exerise

More information

Lecture 6: Coding theory

Lecture 6: Coding theory Leture 6: Coing theory Biology 429 Crl Bergstrom Ferury 4, 2008 Soures: This leture loosely follows Cover n Thoms Chpter 5 n Yeung Chpter 3. As usul, some of the text n equtions re tken iretly from those

More information

Student Activity 3: Single Factor ANOVA

Student Activity 3: Single Factor ANOVA MATH 40 Student Activity 3: Single Fctor ANOVA Some Bsic Concepts In designed experiment, two or more tretments, or combintions of tretments, is pplied to experimentl units The number of tretments, whether

More information

SUMMER KNOWHOW STUDY AND LEARNING CENTRE

SUMMER KNOWHOW STUDY AND LEARNING CENTRE SUMMER KNOWHOW STUDY AND LEARNING CENTRE Indices & Logrithms 2 Contents Indices.2 Frctionl Indices.4 Logrithms 6 Exponentil equtions. Simplifying Surds 13 Opertions on Surds..16 Scientific Nottion..18

More information

CS 491G Combinatorial Optimization Lecture Notes

CS 491G Combinatorial Optimization Lecture Notes CS 491G Comintoril Optimiztion Leture Notes Dvi Owen July 30, August 1 1 Mthings Figure 1: two possile mthings in simple grph. Definition 1 Given grph G = V, E, mthing is olletion of eges M suh tht e i,

More information

Solutions to Assignment 1

Solutions to Assignment 1 MTHE 237 Fll 2015 Solutions to Assignment 1 Problem 1 Find the order of the differentil eqution: t d3 y dt 3 +t2 y = os(t. Is the differentil eqution liner? Is the eqution homogeneous? b Repet the bove

More information

Co-ordinated s-convex Function in the First Sense with Some Hadamard-Type Inequalities

Co-ordinated s-convex Function in the First Sense with Some Hadamard-Type Inequalities Int. J. Contemp. Mth. Sienes, Vol. 3, 008, no. 3, 557-567 Co-ordinted s-convex Funtion in the First Sense with Some Hdmrd-Type Inequlities Mohmmd Alomri nd Mslin Drus Shool o Mthemtil Sienes Fulty o Siene

More information

NEW CIRCUITS OF HIGH-VOLTAGE PULSE GENERATORS WITH INDUCTIVE-CAPACITIVE ENERGY STORAGE

NEW CIRCUITS OF HIGH-VOLTAGE PULSE GENERATORS WITH INDUCTIVE-CAPACITIVE ENERGY STORAGE NEW CIRCUITS OF HIGH-VOLTAGE PULSE GENERATORS WITH INDUCTIVE-CAPACITIVE ENERGY STORAGE V.S. Gordeev, G.A. Myskov Russin Federl Nuler Center All-Russi Sientifi Reserh Institute of Experimentl Physis (RFNC-VNIIEF)

More information

ARITHMETIC OPERATIONS. The real numbers have the following properties: a b c ab ac

ARITHMETIC OPERATIONS. The real numbers have the following properties: a b c ab ac REVIEW OF ALGEBRA Here we review the bsic rules nd procedures of lgebr tht you need to know in order to be successful in clculus. ARITHMETIC OPERATIONS The rel numbers hve the following properties: b b

More information

12.4 Similarity in Right Triangles

12.4 Similarity in Right Triangles Nme lss Dte 12.4 Similrit in Right Tringles Essentil Question: How does the ltitude to the hpotenuse of right tringle help ou use similr right tringles to solve prolems? Eplore Identifing Similrit in Right

More information

Sections 5.3: Antiderivatives and the Fundamental Theorem of Calculus Theory:

Sections 5.3: Antiderivatives and the Fundamental Theorem of Calculus Theory: Setions 5.3: Antierivtives n the Funmentl Theorem of Clulus Theory: Definition. Assume tht y = f(x) is ontinuous funtion on n intervl I. We ll funtion F (x), x I, to be n ntierivtive of f(x) if F (x) =

More information