Incremental Maintenance of XML Structural Indexes


Ke Yi, Dept. Computer Science, Duke University
Hao He, Dept. Computer Science, Duke University
Ioana Stanoi, IBM T.J. Watson Research Center
Jun Yang, Dept. Computer Science, Duke University

ABSTRACT

Increasing popularity of XML in recent years has generated much interest in query processing over graph-structured data. To support efficient evaluation of path expressions, many structural indexes have been proposed. The most popular ones are the 1-index, based on the notion of graph bisimilarity, and the recently proposed A(k)-index, based on the notion of local similarity to provide a trade-off between index size and query answering power. For these indexes to be practical, we need effective and efficient incremental maintenance algorithms to keep them consistent with the underlying data. However, existing update algorithms for structural indexes essentially provide no guarantees on the quality of the index; the updated index is usually larger in size than necessary, degrading the performance for subsequent queries. In this paper, we propose update algorithms for the 1-index and the A(k)-index with provable guarantees on the resulting index quality. Our algorithms always maintain a minimal index, i.e., merging any two index nodes would result in an incorrect index. For the 1-index, if the data graph is acyclic, our algorithm further ensures that the index is minimum, i.e., it has the least number of index nodes possible. For the A(k)-index, we show that the minimal index our algorithm maintains is also the unique minimum A(k)-index, for both acyclic and cyclic data graphs. Finally, through experimental evaluation, we demonstrate that our algorithms bring significant improvement over previous methods, in terms of both index size and update time.

Author notes: Part of the work was done while the author was visiting IBM T. J. Watson Research Center. Supported in part by the National Science Foundation through CAREER grant CCR and ITR grant EIA. Supported by National Science Foundation CAREER Award under grant IIS.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD 2004 June 13-18, 2004, Paris, France. Copyright 2004 ACM /04/06... $5.00.

1. INTRODUCTION

Increasing popularity of XML in recent years has generated much interest in query processing over graph-structured data. A number of commercial database vendors are making significant efforts to support XML natively, rather than convert it to the traditional relational model. One of the major challenges of this task is to provide support for efficient query processing over XML. To summarize the structure of such data and to support path expression [4] evaluation, novel structural indexes have been proposed [11, 9, 17, 7]. Among the most popular ones are the 1-index [11], based on the notion of graph bisimilarity, and the recently proposed A(k)-index [9], based on the notion of local similarity to provide a trade-off between index size and query answering power. Some structural indexes have also been used as statistical synopses for estimating selectivities of path expressions [3, 16].

Compared with traditional relational indexes, much less attention has been directed to the problem of maintaining structural indexes for XML, with the exception of recent work in [8]. After an XML document is updated, its structural index must be properly maintained so that subsequent queries have a view of the summarized data that is consistent with the updated document. For structural indexes to be practical, we need efficient index maintenance algorithms that guarantee the accuracy and efficiency of these indexes for querying.

There are two basic approaches to index maintenance: reconstruction and incremental maintenance. Reconstruction is simple and usually leads to high-quality indexes, but the overhead of reconstruction makes it unattractive even for databases with moderate update rates.
The second approach, incremental maintenance, updates the existing index incrementally as soon as the underlying database changes. The cost of computing and applying incremental index updates can be potentially much lower than that of reconstruction.

Designing good maintenance algorithms is challenging because of the delicate balance between efficacy and efficiency. Efficacy means preserving the quality of the structural summary. For the same underlying data, there are many correct structural summaries, but they vary greatly in size and hence in query performance. The algorithm should ensure that the updated index is not expanded unnecessarily. On the other hand, efficiency here means that the algorithm itself should be efficient. In some cases, obtaining the smallest possible structural summary is very expensive, so settling on a reasonably small structural summary would be preferable. Finding the right balance in this trade-off is not trivial. Reconstruction provides perfect efficacy, but severely lacks efficiency. In contrast, although the previously proposed update algorithms in [9] are efficient, we show that their efficacy is lacking: the updated index usually has a much larger size than necessary, degrading the performance for subsequent query evaluations.

In this paper, we demonstrate that it is possible to achieve high degrees of both efficacy and efficiency in designing incremental maintenance algorithms for structural indexes. We focus on three types of updates: edge insertion, edge deletion, and subgraph addition. Edge insertion and deletion constitute the basic operations upon which other kinds of updates (e.g., node insertion and deletion) can be based. Although subgraph addition can also be processed by inserting edges one at a time, we provide a separate, more efficient algorithm for it since it is such a common operation. We restrict ourselves to the 1-index and the A(k)-index, but we believe our techniques can also be used for other structural indexes based on node partitioning.

We develop efficient update algorithms for the 1-index and the A(k)-index that, in contrast to previous algorithms, provide provable guarantees on the resulting index quality. More precisely, we make the following contributions:

1. Our algorithms always maintain a minimal index, i.e., merging any two index nodes would result in an incorrect index.
2. If the data graph is acyclic, we show that there is a unique minimal 1-index that is also minimum, i.e., it has the least number of index nodes possible. This result further ensures that our algorithm always maintains the minimum 1-index for acyclic data graphs.
3. For cyclic data graphs, where there might be more than one minimal 1-index, we show by experiments that the minimal 1-index maintained by our algorithm is always very close to the minimum, if not the same.
4. For any data graph (cyclic or acyclic), we show that there is a unique minimal A(k)-index that is also minimum, which ensures that our algorithm always maintains the minimum A(k)-index for any data graph.
5. Through an extensive experimental study, we demonstrate that our algorithms are not only effective in preserving index quality, but also very efficient in terms of computation cost.

The rest of the paper is organized as follows. We first survey previous work in Section 2.
In Section 3, we present the data model of XML and its structural indexes, as well as the basic concepts related to the theory and algorithms. We give a general overview of our algorithms in Section 4, followed by the detailed update algorithms for the 1-index and the A(k)-index in Sections 5 and 6, respectively. We study their practical performance experimentally in Section 7. Finally, we conclude in Section 8.

2. PREVIOUS WORK

Query optimization for XML has been a popular subject of study [10, 5]. Indexing is essentially used to avoid exhaustive traversal of the documents for query processing. Signature-based techniques have the same goal of reducing the search space. They have been used extensively in information retrieval and have also been adapted for XML data recently in [14, 15]. With this approach, each node of the XML tree is annotated with the bitwise OR of the hash values of its child nodes. Existence of a tag in the subtree of a node can therefore be estimated by comparing the hashed value of the child tag with the signature of the node. Updates may however lead to recomputation of the signatures of all ancestors.

Structural summaries for XML have been used for indexing, query pruning and rewriting, and selectivity estimation. DataGuides [6] was one of the first structural summaries used in XML query processing. The notion of simulation, more commonly used in graph theory, was applied in [5] to schema validation as well as query pruning and rewriting for semistructured data. A number of structural indexes based on simulation followed. The 1-index [11] partitions data nodes into equivalence classes based on bisimilarity. To reduce the size of the 1-index, the A(k)-index was proposed [9]. It uses local similarity for partitioning, thereby compressing the 1-index at the cost of losing some structural information about the underlying data. Very recently, other techniques [17, 7] have been proposed to further improve the flexibility and efficiency of the A(k)-index.

Three important issues need to be considered for any index: construction, query evaluation, and maintenance. Paige and Tarjan [12] gave an iterative splitting algorithm to construct a 1-index in O(m log n) time, where m is the number of edges and n is the number of nodes in the data graph.
In [9], an algorithm based on similar ideas is used to construct an A(k)-index in time O(km). Many different query evaluation strategies use these structural indexes; see [11, 9] for details. In this paper, we only focus on the maintenance issue.

The only known update algorithm for the 1-index is the propagate algorithm from [8], which uses Paige and Tarjan's construction algorithm [12] to handle edge changes. This algorithm essentially provides no guarantee on the quality of the resulting index. In the experiments of [8], the index was shown to have 3% to 5% more nodes than the minimum index after a relatively small number of edge insertions (500 in a data graph with about 200,000 nodes); no performance results were reported for deletions. A subgraph addition algorithm based on reconstruction was also given in [8].

Intuitively, because of its locality, the A(k)-index should be easier to maintain than the 1-index. However, no good update algorithm for the A(k)-index has been proposed so far, except for some simple algorithms mentioned in [8, 17]. These approaches all suffer from the same problem of generating too many unnecessary nodes, which undermines the compactness advantage of the A(k)-index. Designing efficient incremental maintenance algorithms for the A(k)-index was left as an interesting area for future research in [8].

3. PRELIMINARIES

Data model. In this paper, we model XML or other semistructured data as a directed, labeled graph $G = (V, E, \mathit{root}, \Sigma, \mathit{label}, \mathit{oid}, \mathit{value})$. Each edge in $E$ indicates an object-subobject or IDREF relationship. Each node in $V$ is labeled with a string from $\Sigma$ via the $\mathit{label}$ function and with a unique identifier via the $\mathit{oid}$ function. It may also optionally have a value given by the $\mathit{value}$ function. There is a single root node with the distinguished label ROOT and no incoming edges. An example XML document under this model is shown in Figure 1, where object-subobject relations are shown in solid lines, and IDREF relations are shown in dashed lines. A database with multiple XML documents can be modeled as a single data graph with an artificial root connecting the graphs corresponding to the individual files.

[Figure 1: An XML database example. Node labels in the figure include root, site, regions, people, auctions, africa, asia, person, auction, item, seller, and bidder.]

We refer to the nodes and edges in $V$ and $E$ as data nodes and data edges, or dnodes and dedges, respectively, to differentiate them from those in an index graph, to be introduced below. We will use $u, v, \ldots$ to denote dnodes, and $\mathit{Succ}(u)$ to denote the set of $u$'s successors, i.e., $\mathit{Succ}(u) = \{v \mid (u, v) \in E\}$.

Structural indexes. A structural index (or structure summary) for a data graph takes the form of another labeled directed graph $(V_I, E_I)$, which is built by the following general procedure: (1) partition the dnodes into classes according to some equivalence relation, (2) make an index node (or inode) for each equivalence class, with all dnodes in this class being its extent, and (3) add an index edge (or iedge) from inode $I$ to inode $J$ if there is a dedge from some dnode in the extent of $I$ to some dnode in the extent of $J$. We use $\Phi(G)$ to denote a structural index built for data graph $G$, and $I[v]$ to denote the inode whose extent contains dnode $v$. From now on, we will not distinguish between an inode and its extent when there is no confusion. Since a structural index is completely determined by its partition of the dnodes, we also do not distinguish between an index and its partition of the dnodes. We define $\mathit{Succ}(I)$ to be $\bigcup_{u \in I} \mathit{Succ}(u)$, the dnode successors of the dnodes in $I$, and $\mathit{ISucc}(I) = \{J \mid (I, J) \in E_I\}$, the index successors of $I$. We use $\mathcal{I}, \mathcal{J}, \ldots$ to represent sets of inodes, and define $\mathit{Succ}(\mathcal{I}) = \bigcup_{I \in \mathcal{I}} \mathit{Succ}(I)$.

Evaluation of path expressions can often be made faster with a structural index $\Phi(G)$ by executing the path expression $R$ on $\Phi(G)$, which is often much smaller than the original data graph $G$. The result of $R$ is contained in the union of the extents of the inodes that match $R$, because any structural index that is constructed by the procedure above is safe. However, not all structural indexes are precise, i.e., the result of some query $R$ on $\Phi(G)$ may contain false positives. Different structural indexes can be obtained by choosing different equivalence relations in step (1) above.
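The three-step construction procedure above can be illustrated with a minimal Python sketch. The function name `build_index`, the dictionary-based representation, and the toy data graph are assumptions made for illustration only; the paper does not prescribe a concrete data structure.

```python
from collections import defaultdict


def build_index(dedges, partition):
    """Build the index graph induced by a partition of the dnodes.

    dedges:    iterable of (u, v) data edges.
    partition: dict mapping every dnode to the id of its equivalence class.
    Returns (extents, iedges): extents maps each inode id to its set of
    dnodes, and iedges is the set of (I, J) index edges.
    """
    extents = defaultdict(set)
    for dnode, inode in partition.items():
        extents[inode].add(dnode)  # step (2): one inode per class
    # Step (3): an iedge I -> J exists if some dedge goes from I's extent to J's.
    iedges = {(partition[u], partition[v]) for u, v in dedges}
    return dict(extents), iedges


# Hypothetical toy data graph; partitioning by label plays the role of step (1).
labels = {0: "ROOT", 1: "a", 2: "a", 3: "b", 4: "b"}
dedges = [(0, 1), (0, 2), (1, 3), (2, 4)]
extents, iedges = build_index(dedges, labels)
```

In this toy example the five dnodes collapse into three inodes, which is the sense in which the index graph can be much smaller than the data graph.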
The 1-index [11] uses bisimilarity [13] to partition the dnodes. For our purpose, we use the following equivalent definition for the 1-index based on the notion of stability [12]:

Definition 1. An inode $I$ is stable with respect to $J$ if either $I \subseteq \mathit{Succ}(J)$ or $I \cap \mathit{Succ}(J) = \emptyset$. For a data graph $G$, an index $\Phi(G)$ is stable w.r.t. index $\Phi'(G)$ if for any inodes $I \in \Phi(G)$, $I' \in \Phi'(G)$, $I$ is stable w.r.t. $I'$.

Definition 2. A structural index $\Phi(G)$ is called a 1-index if (1) all dnodes in any inode of $\Phi(G)$ have the same label, and (2) it is stable with respect to itself. A minimum 1-index is the 1-index with the minimum number of inodes.

Note that if $I$ is not stable w.r.t. $J$, we can make it stable by splitting $I$ into $I \cap \mathit{Succ}(J)$ and $I - \mathit{Succ}(J)$. This is the basic operation for ensuring the correctness of the index in the construction algorithm of [12] and in our algorithms.

There may be more than one 1-index for a given data graph, all of which can be used in the same way for query evaluation. Of course they differ in performance: the smaller the index, the better the performance. The best one is the minimum 1-index, while the worst is the data graph itself (also a valid 1-index), where we do not gain anything from using it. In [12], the following result gives the relationship between the minimum 1-index and other 1-indexes.

Definition 3. For a data graph $G$, a structural index $\Phi(G)$ is a refinement of another index $\Phi'(G)$ if for any inode $I \in \Phi(G)$, there exists an inode $I' \in \Phi'(G)$ such that $I \subseteq I'$.

Lemma 1. There is a unique minimum 1-index for any given data graph, and any other 1-index is a refinement of the minimum 1-index.

However, even the minimum 1-index can sometimes have too many inodes, especially for highly irregular data graphs, resulting in poor query performance. To alleviate the problem, the A(k)-index [9] was proposed to shrink the index size by using k-bisimilarity to partition dnodes. We use the following equivalent definition for the A(k)-indexes.

Definition 4. Given any data graph $G$, the A(0)-index is the structural index obtained by simply partitioning the dnodes of $G$ by their labels.
For $1 \le i \le k$, a structural index $\Phi(G)$ is called an A(i)-index if there exists an A(i-1)-index $\Phi'(G)$ such that $\Phi(G)$ is a refinement of $\Phi'(G)$ and it is stable with respect to $\Phi'(G)$. A minimum A(k)-index is the A(k)-index with the minimum number of inodes.

Note that the A(k)-index is not precise any more, because it only preserves paths of length up to $k$. For path expressions longer than $k$, it may generate false positives and we need a validation step on the original data graph to eliminate them. Nevertheless, in [9], it was shown by experiments that even with this extra validation step, the total evaluation cost is much less than that of the 1-index, due to the small sizes of the A(k)-indexes, for typical values of $k = 2, \ldots, 5$. A result parallel to Lemma 1 holds for the A(k)-index [9]:

Lemma 2. For any given data graph $G$, there is a unique minimum A(k)-index. Any other A(k)-index is a refinement of the minimum A(k)-index.

Quality of indexes. When there are updates to the data graph $G$, it is sometimes difficult and costly to maintain the minimum index, but as discussed before, there are many correct indexes and any of them can be used for query processing in the same way as the minimum index. However, they range from the minimum index to the data graph itself, hence differ greatly in performance [9, 17]. Thus, we would like to keep the index size as small as possible when doing maintenance. To measure the effectiveness of our update algorithms, we define the quality of the index to be

$$\frac{\#\,\text{inodes in the index}}{\#\,\text{inodes in the minimum index}} - 1,$$

which we would like to keep as close to zero as possible. Note that this is the same metric used by [8] to measure the quality of the index after a sequence of updates.

4. ALGORITHMS OVERVIEW

The basic idea behind our new update algorithms is to iteratively make local improvements after correctness is first ensured. All our algorithms consist of a split phase and a merge phase. Therefore we will sometimes generally refer to them as split/merge algorithms. The split phase uses ideas from the index construction algorithms to first make the index correct by splitting some inodes, while the merge phase tries to merge nearby inodes together without violating any constraint, one pair at a time, until no more merges can be made. Both the split and merge phases are carried out in an iterative and local manner: we start from the newly inserted (or deleted) edge, and proceed step by step. In each step, we try to split (or merge) the children of some new inode generated from previous splits (or merges).

The nice property of our algorithms is that, although these operations are carried out in a local manner, each inode in the resulting index cannot be merged with any other inode without violating the stability constraint. We say that such an index is minimal. The precise definitions of minimal indexes take slightly different forms for the 1-index and the A(k)-index, so we will defer them to the respective sections.

From Lemmas 1 and 2, we know that the minimum index is unique. However, there might be more than one minimal index for a given data graph. Nonetheless, in many cases we can prove that there is a unique minimal index, i.e., for acyclic 1-indexes and general A(k)-indexes. In these cases, our algorithms can further guarantee that the minimum index is always maintained.

5. UPDATES FOR THE 1-INDEX

5.1 Edge Insertion and Deletion

The algorithms.
We first use a running example to demonstrate how our algorithm updates the 1-index when a dedge is inserted into the data graph. See Figure 2. The data graph is shown in (a), where we use letters to represent labels and numbers to represent dnodes. The new dedge to be inserted is shown with a dashed line. The 1-index before the update is shown in (b), where the inodes' extents are shown in brackets. The split phase first checks if there is an iedge between the two inodes containing the source and sink of the new dedge. In this case there is not, so we split the inode {3, 4} in (b) into an inode that contains dnode 4 and one that contains the rest of the dnodes (Figure 2(c)). Then, this split triggers the split of inode {6, 7} because it now becomes unstable with respect to the two new inodes resulting from the previous split (Figure 2(d)). Now, every inode is stable with respect to every inode and the split phase ends. The merge phase starts by looking for an inode among the siblings of {4}, the inode containing the sink of the new dedge, to see if there is an inode that has the same label and the same set of index parents. We find inode {5} in this case and then merge inodes {4} and {5} together (Figure 2(e)). Next, we iteratively consider the possible merges among the children of newly generated inodes from previous merges. In this example, we will merge inodes {7} and {8} together. The final result of the update is shown in Figure 2(f).

More formally, our algorithm first checks if the new edge $(u, v)$ makes $v$ not bisimilar with the rest of the dnodes in $I[v]$. If yes, we split $I[v]$ into one inode containing $v$ itself and one that contains the rest of the dnodes. A compound block is a set of inodes that are the new inodes resulting from a previous split. The split phase basically uses Paige and Tarjan's 1-index construction algorithm to iteratively split inodes until we get a partition that is stable with respect to itself (hence correct). We start with only one compound block consisting of the two new inodes. In each of the split steps, we take out a compound block $\mathcal{I}$, pick an inode $I \in \mathcal{I}$ such that $|I| \le \frac{1}{2} \sum_{J \in \mathcal{I}} |J|$, and make all other inodes stable with respect to $\mathit{Succ}(I)$ and $\mathit{Succ}(\mathcal{I} - \{I\})$.
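The stabilizing split used in each split step follows directly from Definition 1 and can be sketched in Python as below. The helper names `succ`, `is_stable`, and `split`, the plain-set representation of inodes, and the example edges are assumptions for illustration; the sketch shows one split only, not the queue of compound blocks.

```python
def succ(dedges, dnodes):
    """Succ(S): all dnodes that have at least one parent in S."""
    return {v for (u, v) in dedges if u in dnodes}


def is_stable(inode, other, dedges):
    """True if inode is contained in Succ(other) or disjoint from it."""
    s = succ(dedges, other)
    return inode <= s or not (inode & s)


def split(inode, other, dedges):
    """Split inode into the part inside Succ(other) and the part outside."""
    s = succ(dedges, other)
    return inode & s, inode - s


# Hypothetical example: only dnode 4 has a parent in {2}.
dedges = [(1, 3), (1, 4), (2, 4)]
inside, outside = split({3, 4}, {2}, dedges)
```

Here `{3, 4}` is unstable with respect to `{2}`, and both halves returned by `split` are stable with respect to it.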
Stabilizing inodes in this way may in turn split other inodes and generate new compound blocks, which are added to the queue of compound blocks. The split phase ends when the queue is empty.

The merge phase starts from $I[v]$ and tries to merge inodes together iteratively until no more merges can be made. We first look for an inode with the same label and index parents as $I[v]$. If one exists, we merge it with $I[v]$, and put the newly merged inode into a queue of merged inodes. In each of the following merge steps, we take out one merged inode $I$ from the queue, and consider the possible merges among the index successors of $I$. We also add newly merged inodes into the queue. The merge phase ends when the queue is empty.

Our complete 1-index edge insertion algorithm is described in Figure 3. The edge deletion algorithm differs only slightly. For simplicity of presentation, we assume that there are no self-cycles in the 1-index (i.e., an inode that points to itself), which is true for virtually all XML databases. Our algorithms can be modified to take care of self-cycles as well, only that some details get a little messy. Note that in the algorithm description we only specify how the partition of dnodes gets updated; we do not bother to state explicitly how iedges are handled, because they are completely determined from the partition by the definition of structural indexes (Section 3). These iedges can also be easily maintained as we update the inode extents, using techniques similar to those in [8].

Efficacy. Now we give the formal definition for minimal 1-indexes.

Definition 5. For a data graph $G$, a 1-index $\Phi(G)$ is minimal if for any two inodes $I, J \in \Phi(G)$, either (1) they have different labels, or (2) there exists an inode $K \in \Phi(G)$ such that $I \cup J$ is not stable with respect to $K$.

For example, for the data graph in Figure 2(a) after the dedge insertion, the index in 2(f) is a minimal 1-index (and minimum at the same time), the ones in 2(d) and (e) are not minimal, and the one in 2(c) is not even a valid 1-index. Note that a 1-index is minimal if and only if it has no two inodes that have the same label and the same set of index parents, which follows directly from the definition of stability.

[Figure 2: An example of updating the 1-index after a dedge insertion. Panels: (a) data graph; (b) old 1-index; (c) split phase begins; (d) split phase ends; (e) merge phase begins; (f) merge phase ends.]

procedure insert_1_index_edge(u, v)
begin
    add a dedge from u to v;
    if there is an iedge from $I[u]$ to $I[v]$ then return;
    /* replace the 2 lines above with the following for deletions:
       delete the dedge $(u, v)$;
       if there exist $u' \in I[u]$, $v' \in I[v]$ and there is a dedge from $u'$ to $v'$ then return; */
    // split phase
    $Q = \emptyset$;
    if $|I[v]| > 1$ then
        split $I[v]$ into $I_1 = \{v\}$ and $I_2 = I[v] - \{v\}$;
        $Q = \{\{I_1, I_2\}\}$;
    while $Q \ne \emptyset$ do
        pick any $\mathcal{I} \in Q$, remove it from $Q$;
        pick $I \in \mathcal{I}$ s.t. $|I| \le \frac{1}{2} \sum_{J \in \mathcal{I}} |J|$;
        if $|\mathcal{I}| \ge 3$ then insert $\mathcal{I} - \{I\}$ into $Q$;
        foreach inode $K \in \mathit{ISucc}(\mathcal{I})$ do
            split $K$ into $K_1 = K \cap \mathit{Succ}(I)$ and $K_2 = K - K_1$;
            split $K_1$ into $K_{11} = K_1 \cap \mathit{Succ}(\mathcal{I} - \{I\})$ and $K_{12} = K_1 - K_{11}$;
            let $\mathcal{K} = \{K_{11}, K_{12}, K_2\} - \{\emptyset\}$;
            if $|\mathcal{K}| \ge 2$ then
                if $\exists \mathcal{J} \in Q$ s.t. $K \in \mathcal{J}$ then
                    replace $K$ in $\mathcal{J}$ with the inodes in $\mathcal{K}$;
                else add $\mathcal{K}$ to $Q$;
    // merge phase
    $Q = \emptyset$;
    look for an inode $J$ with the same label as $v$ among $I[v]$'s siblings that have the same set of index parents as $I[v]$;
    if such an inode $J$ exists then
        merge $I[v]$ and $J$ into $K = J \cup I[v]$;
        $Q = \{K\}$;
    while $Q \ne \emptyset$ do
        pick any $I \in Q$, remove $I$ from $Q$;
        let $\mathcal{I} = \mathit{ISucc}(I)$;
        partition $\mathcal{I}$ into equivalence classes according to their labels and index parents;
        foreach equivalence class $\mathcal{J} \subseteq \mathcal{I}$ do
            if $|\mathcal{J}| \ge 2$ then
                merge the inodes in $\mathcal{J}$ into $J = \bigcup \mathcal{J}$;
                $Q = Q - \mathcal{J}$;
                insert $J$ into $Q$;
end

Figure 3: Insert an edge into a 1-index.

[Figure 4: Minimal 1-indexes might not be unique. Panels: (a) data graph; (b) minimum 1-index; (c) minimal (but not minimum) 1-index.]

Minimal 1-indexes might not be unique. For example, the indexes in Figure 4(b) and 4(c) are both minimal 1-indexes for the data graph in 4(a), but only the one in 4(b) is minimum.

Lemma 3. If the 1-index before the update is minimal, the new index generated by the split/merge algorithm is also a minimal 1-index.

Proof. Let $(u, v)$ be the dedge just inserted (or deleted).
The algorithm first checks if this edge update changes any index predecessor-successor relations. If not, it simply returns. The resulting index is still a minimal 1-index simply because the index before the update is a minimal 1-index.

Assume now the update indeed causes some changes to the index. Let us call the data graph before the update $G_0$, and the one after the update $G_2$. Imagine we relabel $v$ of $G_2$ with a new label that is different from all others, and call this data graph $G_1$. We call the 1-index before the update $\Phi_0(G_0)$, the one after the split phase but before the merge phase $\Phi_1(G_1)$ (with relabeled $v$), and the final 1-index $\Phi_2(G_2)$. We will show that if $\Phi_0(G_0)$ is a minimal 1-index, then $\Phi_1(G_1)$ and $\Phi_2(G_2)$ are both minimal 1-indexes, too.

If $v$ is in an inode by itself in $\Phi_0(G_0)$, the split phase does nothing. In this case, the only inode in $\Phi_1(G_1)$ that may have a different set of index parents than in $\Phi_0(G_0)$ is $I[v]$, which by definition has a distinguished label in $\Phi_1(G_1)$; therefore it cannot be merged with any other inode. So $\Phi_1(G_1)$ is minimal in this case.

Suppose otherwise that $v$ shares an inode with some other dnodes in $\Phi_0(G_0)$. After insertion, $v$ has a different set of index parents than these other dnodes, and is then singled out by the split phase, which afterwards propagates the split using Paige and Tarjan's algorithm. That $\Phi_1(G_1)$ is indeed a correct 1-index follows from the correctness of Paige and Tarjan's algorithm, which always returns the coarsest self-stable refinement of the starting partition. To see that it is also minimal, we define the index parents of a dnode $w$ to be the set of inodes, each of which contains at least one of $w$'s parents, i.e., the set $\{I[w'] \mid w \in \mathit{Succ}(w')\}$, and we will show that the split phase maintains the invariant that no two dnodes in different inodes have the same label and the same index parents. Note that if the index is a valid 1-index, the index parents of any dnode $w$ are the same as the index parents of $I[w]$, so this invariant is true in a 1-index if and only if this 1-index is minimal. The invariant is true before the split phase because $\Phi_0(G_0)$ is a minimal 1-index. It is still maintained when we relabel $v$ and single it out as a separate inode, because $v$'s label is now different from any others. In each of the following split steps, whenever we split an inode into two, the newly generated two inodes must have at least one different index parent; otherwise they would not get split. Since we already know $\Phi_1(G_1)$ is a 1-index, it is also minimal because the invariant holds.

Next, we need to show that $\Phi_2(G_2)$ is a minimal 1-index, which is the result of labeling $v$ back to its original label and applying the merge phase on $\Phi_1(G_1)$. It is easy to see that it is a 1-index because the merge phase only merges inodes that have the same label and index parents. Since $\Phi_1(G_1)$ is minimal, and the only difference between $G_1$ and $G_2$ is $v$'s label, the only possible two inodes that may have the same label and the same index parents in $\Phi_1(G_2)$ are $I[v]$ and some other inode. The merge phase starts exactly by looking for this only possible merge. Further notice that after two inodes are merged, it can only trigger new possible merges among the index successors of the newly merged inode, because the index parents of all other inodes remain unchanged. Therefore, when the merge phase completes, no two inodes in $\Phi_2(G_2)$ can be merged, so $\Phi_2(G_2)$ is minimal.

Keeping the 1-index minimal is probably the best one can do with reasonable cost, since it is much cheaper to check if the 1-index is minimal, as our algorithms do, than to determine if it is minimum.
For example, in order to find out that the 1-index in Figure 4(c) is not minimum, we need to be able to detect two merges simultaneously, and the number of such simultaneous merges might be as high as $\Theta(n)$. In practice, it is often good enough to keep the 1-index minimal, and in many cases, the minimal 1-index indeed turns out to be the minimum 1-index. Even if we are unlucky and get stuck in a minimal 1-index that is not minimum, our experiments show that the difference between the two is often very small.

Many data graphs are acyclic. For example, in a bibliography database, if we want to model the reference relations with IDREF edges, it is an acyclic graph, as a paper can only reference papers that appear earlier in time. Many other XML databases that model hierarchical relations are naturally acyclic, or even trees. For such databases, our algorithms can provide an even stronger guarantee that the minimum 1-index is always maintained, because the minimal 1-index is unique in this case.

Lemma 4. For any acyclic data graph $G$, there is a unique minimal 1-index $\Phi(G)$, which is also minimum.

Proof. First we show that any 1-index of $G$ is also acyclic. Suppose there was a cycle in the 1-index. By definition, for any iedge $I \to J$, any dnode in $J$ has at least one parent in $I$. By following iedges backwards in a cycle, we know there exists a path of arbitrary length in $G$, which could only happen if $G$ is cyclic, too.

Suppose that $\Phi(G)$ is the minimum 1-index and $\Phi'(G)$ is a minimal 1-index different from $\Phi(G)$. We order the inodes in $\Phi(G)$ topologically and pick the first inode $I$ that does not appear in $\Phi'(G)$. By Lemma 1, $\Phi'(G)$ is a refinement of $\Phi(G)$, so there exist at least two inodes $I_1, I_2 \in \Phi'(G)$ such that $I_1 \subset I$, $I_2 \subset I$, and $I_1$ and $I_2$ have the same label. For any index parent $J$ of $I$, $J$ also appears in $\Phi'(G)$ because $J$ is before $I$ in the topological order. Then $J$ is also an index parent of $I_1$ and $I_2$ in $\Phi'(G)$ because each dnode in $I$ has at least one parent in $J$. For any inode $J$ that is not an index parent of $I$, $J$ cannot be an index parent of $I_1$ or $I_2$, either, because $J$ does not contain any parent of any dnode in $I$, and both $I_1$ and $I_2$ are subsets of $I$. So $I_1$ and $I_2$ have the same set of index parents, which are the same as those of $I$. This contradicts the fact that $\Phi'(G)$ is minimal.
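The characterization used in these proofs (two inodes can be merged exactly when they share a label and a set of index parents) can be turned into a small Python sketch. Note that this is a simplified global fixed-point loop, not the queue-driven local merge phase of Figure 3; the function name `merge_to_minimal`, the integer inode ids, and the toy tree are assumptions for illustration, and self-cycles in the index are assumed absent as in the text.

```python
from collections import defaultdict


def merge_to_minimal(labels, dedges, partition):
    """Merge inodes with equal label and equal index parents until none remain.

    labels:    dict dnode -> label.
    dedges:    list of (u, v) data edges.
    partition: dict dnode -> inode id (assumed to be a valid 1-index).
    """
    partition = dict(partition)
    while True:
        # Index parents of each inode under the current partition.
        index_parents = defaultdict(set)
        for u, v in dedges:
            index_parents[partition[v]].add(partition[u])
        # All dnodes of an inode share one label in a valid 1-index.
        inode_label = {inode: labels[d] for d, inode in partition.items()}
        groups = defaultdict(list)
        for inode, label in inode_label.items():
            groups[(label, frozenset(index_parents[inode]))].append(inode)
        # Map every inode of a mergeable group to one representative.
        rename = {i: min(g) for g in groups.values() if len(g) > 1 for i in g}
        if not rename:
            return partition
        partition = {d: rename.get(i, i) for d, i in partition.items()}


# Hypothetical acyclic data graph (a tree); start from the trivial 1-index
# in which every dnode is its own inode.
labels = {0: "ROOT", 1: "a", 2: "a", 3: "b", 4: "b"}
dedges = [(0, 1), (0, 2), (1, 3), (2, 4)]
result = merge_to_minimal(labels, dedges, {d: d for d in labels})
```

On this tree the first pass merges the two `a` inodes, which gives the two `b` inodes the same index parent, so the second pass merges them as well. This mirrors how one merge can trigger further merges among index successors.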
Combining Lemmas 3 and 4, we have:

Theorem 1. For acyclic data graphs, the split/merge algorithm always maintains the minimum 1-index during edge insertions and deletions. For cyclic data graphs it always maintains a minimal 1-index.

Efficiency. Theorem 1 gives a theoretical guarantee on the efficacy of the split/merge algorithm, but how costly is it in terms of computation? We continue to use $\Phi_0(G_0)$, $\Phi_1(G_2)$, $\Phi_2(G_2)$ to denote the index before the update, between the split and merge phases, and after the update, respectively. It is easy to see that the numbers of split and merge operations are $|\Phi_1(G_2)| - |\Phi_0(G_0)|$ and $|\Phi_1(G_2)| - |\Phi_2(G_2)|$, respectively. The first part is essentially the cost of the propagate algorithm, while the second part is the minimum number of merges required to shrink the intermediate result down to minimal.

Unfortunately, in the worst case, this intermediate index $\Phi_1(G_2)$ could have many more nodes than the index before or after the update. See for example Figure 5, where the triangles represent two subtrees with the same structure. By arbitrarily enlarging these subtrees, we can have an intermediate index that has $\Omega(n)$ more nodes than the old or the new index. This is also a problem for the propagate algorithm and was identified in [8].

Nevertheless, the worst-case example of Figure 5 is rather contrived and is rare in practice. As observed by [8], as well as in our own experiments with both real-life and benchmark data, the intermediate index on average only has 0.01% more nodes, which means that the update algorithm is really incremental, operating only on a very small fraction of the whole index. Since we have an additional merge phase, the cost of the split/merge algorithm is certainly higher than that of the propagate algorithm, but we feel the merge phase is always worth doing, not only because it gives us a nice theoretical guarantee on the quality of the resulting index, but also for the following practical considerations: (1) With the merge step, we can effectively keep the index size small, leading to a much lower reconstruction frequency. (2) The merge phase always makes the index smaller, hence giving higher query performance.
Typically we have more queries than updates, so the effort spent in improving the quality of the index can very likely be paid back by the savings from subsequent query evaluations.
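As a small worked example of the bookkeeping above, the following Python sketch computes the split and merge counts from the three index sizes, together with the quality measure from Section 3. The function names and all numbers are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def operation_counts(size_before, size_intermediate, size_after):
    """Return (#splits, #merges) for one update.

    size_before:       number of inodes before the update.
    size_intermediate: number of inodes after the split phase.
    size_after:        number of inodes after the merge phase.
    """
    splits = size_intermediate - size_before
    merges = size_intermediate - size_after
    return splits, merges


def index_quality(num_inodes, num_minimum_inodes):
    """Quality metric: relative excess over the minimum index (0 is best)."""
    return num_inodes / num_minimum_inodes - 1


# Hypothetical sizes: 1000 inodes before, 1004 after splitting, 1001 at the end.
splits, merges = operation_counts(1000, 1004, 1001)
quality = index_quality(1001, 1000)
```

With these made-up sizes the update performs four splits and three merges, and an index of 1001 inodes against a minimum of 1000 has a quality of roughly 0.001.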

[Figure 5: Update cost could be high in the worst case. Panels: (a) data graph; (b) old 1-index; (c) intermediate 1-index; (d) final 1-index.]

Finally, as an implementation note, when we split inodes using $\mathit{Succ}(I)$ (or $\mathit{Succ}(\mathcal{I} - \{I\})$), we in fact can split all inodes containing at least one dnode in $\mathit{Succ}(I)$ at the same time by scanning $\mathit{Succ}(I)$ once and creating $K \cap \mathit{Succ}(I)$ for each $K$. The same technique is used in [12, 8].

5.2 Subgraph Addition

We model a subgraph also as a labeled, rooted directed graph, which can certainly be added by inserting its dnodes and their incident dedges one by one using our edge insertion algorithm. But since subgraph addition occurs so frequently, we design a more efficient algorithm that performs the insertions in a batched manner. The basic idea is to build the 1-index first for the new subgraph, and then add all the edges between the new subgraph and the existing data graph using the edge insertion algorithm. Note that the root of the new subgraph must be in an inode by itself in the 1-index of the subgraph. As an optimization, we can insert all the incoming edges to the root of the subgraph and then perform the merge phase just once. The algorithm add_1_index_subgraph is shown in Figure 6.

procedure add_1_index_subgraph(G')
begin
    build the 1-index $\Phi'(G')$ for the new subgraph $G'$;
    union $\Phi'(G')$ with the current 1-index $\Phi(G)$;
    add all incoming dedges that go into $r$, the root of $G'$;
    do the merge phase of insert_1_index_edge starting at $I[r]$;
    foreach other dedge $(u, v)$ between $G$ and $G'$ do
        insert_1_index_edge(u, v);
end

Figure 6: Add a subgraph in a 1-index.

The following corollary follows from Theorem 1.

Corollary 1. Algorithm add_1_index_subgraph maintains the minimum 1-index for acyclic data graphs and a minimal 1-index for cyclic data graphs.

Naturally one would like to delete subgraphs efficiently as well. This is easy, too. Have a special node with a distinguished label DELETE, and add a dedge from this node to the root of the subgraph that we want to delete. This new dedge will single out this subgraph from the rest of the index, and then we can just delete it from the index.

6. UPDATES FOR THE A(K)-INDEX

The algorithm.
Our ideas and techniques for updating the 1-index can be extended to handle updates for the A(k)-index as well. As identified in [8], the A(k)-index is difficult to maintain by itself because updating it requires information contained in an A(k−1)-index. Thus, the basic idea in our algorithm is to maintain all the A(0), A(1), ..., A(k)-indexes together using our 1-index update algorithms. When maintaining the A(i)-index, we use the A(i−1)-index as reference to make split and merge decisions. We will first describe the algorithm, and then discuss how to implement it in a space- and time-efficient manner. Note that all these A(i)-indexes can easily be obtained while we build the A(k)-index; in fact, the construction algorithm [9] builds all the A(0), A(1), ..., A(k)-indexes in order. In the following, we only consider edge insertions and deletions; subgraph addition can be done in a way very similar to what we did for the 1-index.

The A(k)-index update algorithm also consists of a split phase to guarantee correctness and a merge phase to ensure minimality. Suppose the new edge to be inserted (or deleted) is $(u, v)$. We first look for the largest i such that the A(i)-index will not be affected by the edge update. The split phase first creates a new inode containing $v$ itself for each of the A(i+1) to A(k)-indexes. These initial splits generate a number of compound blocks (in the 1-index, we have only one compound block at the beginning), and we put them in a queue. Afterwards, we iteratively split other inodes in a way very similar to what we did for the 1-index. The only difference is that, when we stabilize other inodes with respect to a compound block in the A(i)-index, we need to consider all the inodes in the A(i+1) to A(k)-indexes.

The merge phase also proceeds similarly to that of the 1-index. For each of the affected A(i)-indexes, we first try to merge the inode containing $v$ with another inode. Next, we iteratively merge other inodes together. In each step, if $I$ is a new inode in the A(i)-index generated from a previous merge, we consider the possible merges among the inodes in the A(i+1)-index that contain at least one dnode with a parent in $I$.
The detailed edge insertion (and deletion) algorithm for the A(k)-index is shown in Figure 7. We use $\Phi^{(i)}(G)$ to denote the A(i)-index of data graph $G$; $I^{(i)}, J^{(i)}$ are some inodes in the A(i)-index; $I^{(i)}[v]$ denotes the inode in the A(i)-index that contains dnode $v$. We also use $\mathcal{I}^{(i)}, \mathcal{J}^{(i)}$ to denote sets of inodes in the A(i)-index.
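To make the notation concrete, the following Python sketch builds the A(0), ..., A(k) partitions of a tiny data graph and evaluates the test "find the largest i such that $v \in Succ(I^{(i)}[u])$" that opens the update algorithm. It is an illustration only, not the paper's implementation: it assumes the usual definition of i-bisimilarity (same A(i−1) inode and the same set of A(i−1) inodes among the parents), and the function names, the integer inode ids, and the sample graph are all hypothetical.

```python
from collections import defaultdict


def build_ak_levels(labels, edges, k):
    """Return [level_0, ..., level_k]; level_i[v] is the id of the A(i) inode holding dnode v."""
    parents = defaultdict(set)
    for u, v in edges:
        parents[v].add(u)
    nodes = sorted(labels)

    # A(0): dnodes are grouped purely by label.
    ids, level = {}, {}
    for v in nodes:
        level[v] = ids.setdefault(labels[v], len(ids))
    levels = [level]

    # A(i+1): same A(i) inode and same set of A(i) inodes among the parents.
    for _ in range(k):
        ids, refined = {}, {}
        for v in nodes:
            signature = (level[v], frozenset(level[p] for p in parents[v]))
            refined[v] = ids.setdefault(signature, len(ids))
        level = refined
        levels.append(level)
    return levels


def largest_i_with_v_in_succ(levels, edges, u, v):
    """Largest i such that v already has a parent in I^(i)[u]; -1 if there is none.

    Evaluated on the graph *before* the dedge (u, v) is inserted.
    """
    best = -1
    for i, level in enumerate(levels):
        if any(level[p] == level[u] for p, child in edges if child == v):
            best = i
    return best


# Hypothetical data graph: root 0, three 'a' dnodes, two 'b' dnodes.
labels = {0: "r", 1: "a", 2: "a", 3: "b", 4: "b", 5: "a"}
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (1, 5)]
levels = build_ak_levels(labels, edges, k=2)

# Inserting (5, 3): dnodes 1 and 5 share an A(0) inode but not an A(1) inode,
# so the answer is 0 and only the A(2)-index needs initial splits (j = i + 2).
print(largest_i_with_v_in_succ(levels, edges, 5, 3))
```

With this data, inserting (2, 3) leaves every level untouched because dnode 3 already has a parent in the inode of dnode 2 at all levels, whereas inserting (0, 3) yields −1 and affects A(1) and A(2).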

procedure insert_A(k)_index_edge($u, v$)
begin
    find the largest $i$ s.t. $v \in Succ(I^{(i)}[u])$; if no such $i$ exists, set $i = -1$;
    add a dedge from $u$ to $v$;
    /* replace the lines above with the following for deletions:
       delete the dedge from $u$ to $v$;
       find the largest $i$ s.t. $v \in Succ(I^{(i)}[u])$; if no such $i$ exists, set $i = -1$; */
    // split phase begins
    $Q = \emptyset$;
    for $j = i+2$ to $k$
        if $|I^{(j)}[v]| > 1$ then
            split $I^{(j)}[v]$ into $I^{(j)}_1 = \{v\}$ and $I^{(j)}_2 = I^{(j)}[v] - \{v\}$;
            if $j \le k-1$ then insert $\{I^{(j)}_1, I^{(j)}_2\}$ into $Q$;
    // iterate to split others
    while $Q \ne \emptyset$ do
        pick any $\mathcal{I}^{(j)} \in Q$ with the smallest $j$;
        remove $\mathcal{I}^{(j)}$ from $Q$;
        pick $I^{(j)} \in \mathcal{I}^{(j)}$ s.t. $|I^{(j)}| \le \frac{1}{2} \sum_{J^{(j)} \in \mathcal{I}^{(j)}} |J^{(j)}|$;
        if $|\mathcal{I}^{(j)}| \ge 3$ then insert $\mathcal{I}^{(j)} - \{I^{(j)}\}$ into $Q$;
        foreach inode $K^{(l)}$, $j+1 \le l \le k$ do
            split $K^{(l)}$ into $K^{(l)}_1 = K^{(l)} \cap Succ(I^{(j)})$ and $K^{(l)}_2 = K^{(l)} - K^{(l)}_1$;
            split $K^{(l)}_1$ into $K^{(l)}_{11} = K^{(l)}_1 \cap Succ(\mathcal{I}^{(j)} - \{I^{(j)}\})$ and $K^{(l)}_{12} = K^{(l)}_1 - K^{(l)}_{11}$;
            if $l \le k-1$ then
                let $\mathcal{K}^{(l)} = \{K^{(l)}_{11}, K^{(l)}_{12}, K^{(l)}_2\} - \{\emptyset\}$;
                if $|\mathcal{K}^{(l)}| \ge 2$ then
                    if $\exists \mathcal{J}^{(l)} \in Q$ s.t. $K^{(l)} \in \mathcal{J}^{(l)}$ then
                        replace $K^{(l)}$ in $\mathcal{J}^{(l)}$ with the inodes in $\mathcal{K}^{(l)}$;
                    else add $\mathcal{K}^{(l)}$ to $Q$;
    // merge phase begins
    for $j = i+2$ to $k$ do
        $Q = \emptyset$;
        look for an inode $I^{(j)} \subseteq I^{(j-1)}[v]$ s.t. $I^{(j)}$ has the same index parents in the A(j−1)-index as $I^{(j)}[v]$;
        if such an inode $I^{(j)}$ exists then
            merge $I^{(j)}[v]$ and $I^{(j)}$ with $J^{(j)} = I^{(j)}[v] \cup I^{(j)}$;
            if $j \le k-1$ then insert $J^{(j)}$ into $Q$;
        // iterate to merge others
        while $Q \ne \emptyset$ do
            pick any $I^{(l)} \in Q$ with the smallest $l$;
            remove $I^{(l)}$ from $Q$;
            let $\mathcal{I}^{(l+1)} = \{I^{(l+1)}[w] \mid w \in Succ(w'), w' \in I^{(l)}\}$;
            partition $\mathcal{I}^{(l+1)}$ into equivalence classes according to their labels and index parents in the A(l)-index;
            foreach equivalence class $\mathcal{J}^{(l+1)} \subseteq \mathcal{I}^{(l+1)}$ do
                if $|\mathcal{J}^{(l+1)}| \ge 2$ then
                    merge the inodes in $\mathcal{J}^{(l+1)}$ into $J^{(l+1)} = \bigcup \mathcal{J}^{(l+1)}$;
                    if $l \le k-2$ then $Q = Q - \mathcal{J}^{(l+1)}$; insert $J^{(l+1)}$ into $Q$;
end

Figure 7: Insert an edge into A(k)-index.

Efficacy. Since we are essentially using our 1-index update algorithm to maintain $\Phi^{(i)}(G)$ with respect to $\Phi^{(i-1)}(G)$ for all $i = 1, \ldots, k$, we can show that our algorithm maintains a minimal set of A(i)-indexes in the following sense.

Definition 6.
For any data graph $G$, the set of A(i)-indexes $\Phi^{(0)}(G), \Phi^{(1)}(G), \ldots, \Phi^{(k)}(G)$ is minimal if for all $1 \le i \le k$, merging any two inodes of $\Phi^{(i)}(G)$ will make it unstable with respect to $\Phi^{(i-1)}(G)$.

Lemma 5. The split/merge algorithm always maintains a minimal set of A(i)-indexes.

Proof. Follow the same lines of reasoning as in the proof of Lemma 3.

Since the set of A(i)-indexes is built in a hierarchical manner, which resembles the nature of acyclic 1-indexes, we have the following result for the A(k)-index for any general data graph.

Lemma 6. For any data graph $G$, there is a unique minimal set of A(i)-indexes, each of which is also minimum.

Proof. Let $\Phi^{(0)}(G), \ldots, \Phi^{(k)}(G)$ be the minimum A(i)-indexes of $G$, and $\Psi^{(0)}(G), \ldots, \Psi^{(k)}(G)$ be any minimal set of A(i)-indexes. We will show by induction that $\Phi^{(i)}(G) = \Psi^{(i)}(G)$ for all $i$. The base case $\Phi^{(0)}(G) = \Psi^{(0)}(G)$ holds by definition. Now suppose $\Phi^{(i)}(G) = \Psi^{(i)}(G)$; then merging any two inodes in $\Psi^{(i+1)}(G)$ will make it unstable with respect to $\Phi^{(i)}(G)$. By Lemma 2, $\Psi^{(i+1)}(G)$ is always a refinement of $\Phi^{(i+1)}(G)$. If $\Phi^{(i+1)}(G) \ne \Psi^{(i+1)}(G)$, we would find at least two inodes in $\Psi^{(i+1)}(G)$ that are contained in the same inode of $\Phi^{(i+1)}(G)$; merging these two inodes would not cause $\Psi^{(i+1)}(G)$ to be unstable with respect to $\Phi^{(i)}(G)$, contradicting the minimality of the $\Psi$ set. So we have $\Phi^{(i+1)}(G) = \Psi^{(i+1)}(G)$.

Combining Lemmas 5 and 6, we have:

Theorem 2. For any data graph $G$, the split/merge algorithm always maintains the minimum A(k)-index.

Efficiency. At first impression, maintaining all the A(0) to A(k)-indexes would take a lot of space and increase the update cost. Below we describe a structure called the refinement tree, which is designed to exploit the fact that the A(i+1)-index is always a refinement of the A(i)-index. With this tree (a forest in general) we can maintain the A(i)-index on top of the A(i+1)-index, instead of manipulating massive sets of dnodes directly. The refinement tree includes all the nodes in the A(0) to A(k)-indexes. Tree edges are built by linking each inode in the A(i)-index to the inodes in the A(i+1)-index that are contained in this inode (Figure 8).
With this tree structure, there is no longer any need to store the extents of the inodes in all the A(i)-indexes for $0 \le i \le k-1$, as they can be fully recovered from the extents of A(k)-index nodes.

Let us now see how to use the refinement tree to implement the algorithm insert_A(k)_index_edge, or more precisely, the two basic operations split and merge. Merges are easy: If we merge two A(k)-index inodes, we merge their extents as we did for the 1-index. If we merge two A(i)-index inodes for $1 \le i \le k-1$, we simply merge them together without any operation on their extents; all A(i+1)-index inodes that were children of the two old nodes in the refinement tree now become the children of the new node.

[Figure 8: Refinement tree: tree edges are shown in dotted lines. Panels: (a) Data graph; (b) A(0); (c) A(1); (d) A(k=2).]

Splits need more care. There are two kinds of splits: the initial splits at the beginning of the split phase and the normal splits using $Succ(I^{(j)})$ or $Succ(\mathcal{I}^{(j)} - \{I^{(j)}\})$ (cf. Figure 7). All initial splits together create one new inode containing only $v$ for each of the A(j)-indexes, $j = i+2, \ldots, k$, so we just need to split $I^{(k)}[v]$ and then create a new node on each level $j$ of the refinement tree, pointing only to the new tree node on level $j+1$.

For normal splits using, say, $Succ(I^{(j)})$, we scan through $Succ(I^{(j)})$ and split all inodes whose extents intersect $Succ(I^{(j)})$ at the same time. For each dnode $w \in Succ(I^{(j)})$, there is exactly one inode in the A(l)-index that contains $w$, for $l = j+1, \ldots, k$. These inodes, denoted by $K^{(j+1)}, \ldots, K^{(k)}$, form a path in the refinement tree. For the A(k)-index inode $K^{(k)}$, we carry out the same procedure as with the 1-index: create an A(k)-index inode $\hat{K}^{(k)}$ for $w$ if necessary (it might have been created already while processing an earlier dnode that is k-bisimilar to $w$), and then move $w$ from $K^{(k)}$ to $\hat{K}^{(k)}$. For $l = k-1, \ldots, j+1$, we create an A(l)-index inode $\hat{K}^{(l)}$ for $w$ if necessary, and then make $\hat{K}^{(l+1)}$ a child of $\hat{K}^{(l)}$ in the refinement tree. After all dnodes in $Succ(I^{(j)})$ are scanned, we remove any empty inodes from the A(k)-index, and then any A(l)-index inodes with no children in the refinement tree, for $l = k, k-1, \ldots, j+1$. After all pairs are processed, all splits with respect to $Succ(I^{(j)})$ are completed. The same procedure applies to $Succ(\mathcal{I}^{(j)} - \{I^{(j)}\})$. Note that in this way we only deal with the dnodes in the A(k)-index; maintenance of the A(i)-index only involves inodes in the A(i+1)-index, and the cost of doing so decreases rapidly as i gets smaller.
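The refinement tree and its merge operation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' Java implementation: tree nodes are hypothetical `(level, inode_id)` pairs, only level-k nodes own an extent, and the sample partitions are made up for the example. Splits are omitted to keep the sketch short.

```python
class RefinementTree:
    """Levels 0..k of inodes; only level-k inodes store extents of dnodes."""

    def __init__(self, levels):
        # levels[i][v] is the id of the A(i) inode that contains dnode v.
        self.k = len(levels) - 1
        self.children = {}  # (i, id) -> set of (i + 1, id) tree children
        self.parent = {}    # (i + 1, id) -> (i, id)
        self.extent = {}    # (k, id) -> set of dnodes
        for v, inode_id in levels[self.k].items():
            self.extent.setdefault((self.k, inode_id), set()).add(v)
        for i in range(self.k):
            for v in levels[i]:
                node = (i, levels[i][v])
                child = (i + 1, levels[i + 1][v])
                self.children.setdefault(node, set()).add(child)
                self.parent[child] = node

    def recover_extent(self, node):
        """Rebuild the extent of any inode from the level-k extents below it."""
        if node[0] == self.k:
            return set(self.extent[node])
        result = set()
        for child in self.children[node]:
            result |= self.recover_extent(child)
        return result

    def merge(self, a, b):
        """Merge inode b into inode a; both must sit under the same tree parent."""
        assert a[0] == b[0] and a != b
        assert self.parent.get(a) == self.parent.get(b)
        if a[0] == self.k:
            # Level k: the extents themselves are merged.
            self.extent[a] |= self.extent.pop(b)
        else:
            # Inner level: only re-hang the children, extents are untouched.
            for child in self.children.pop(b):
                self.children[a].add(child)
                self.parent[child] = a
        old_parent = self.parent.pop(b, None)
        if old_parent is not None:
            self.children[old_parent].discard(b)


# Hypothetical partitions of dnodes 3..8 for k = 2.
levels = [
    {3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
    {3: 0, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2},
    {3: 0, 4: 1, 5: 1, 6: 2, 7: 3, 8: 3},
]
tree = RefinementTree(levels)
tree.merge((1, 0), (1, 1))  # inner-level merge: no extent is touched
tree.merge((2, 2), (2, 3))  # level-k merge: extents are unioned
print(sorted(tree.recover_extent((1, 0))), sorted(tree.extent[(2, 2)]))
```

Storing extents only at level k mirrors the storage argument in the text: every dnode appears once, and inner levels cost only tree edges.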
Apart from the refinement tree edges, there are two types of iedges we need to maintain: the normal intra-iedges inside the A(k)-index, used for query processing, and the inter-iedges in the refinement tree that connect inodes in A(i) to their inode successors in A(i+1), which are required in order for the maintenance algorithm to function efficiently. Both types of iedges can be maintained cheaply during the split/merge process. Optionally, one could also maintain the intra-iedges inside the A(i)-indexes for $i = 1, \ldots, k-1$, which will speed up the evaluation of path expressions of length less than k, but we will not explore this option further in this paper.

Although we store more information than the A(k)-index alone, the extra storage overhead is low. We store each dnode only once (in the extent of an A(k)-index inode), and we use only one hash table for the reverse mapping from the dnodes to the A(k)-index inodes. For the A(i)-indexes where $i < k$, we store only the refinement tree edges and the inter-iedges. Since the number of inodes in the A(i)-index decreases rapidly as i gets smaller, this storage overhead is insignificant compared with the cost of storing extents and the dnode-to-inode mapping, which must be paid by a stand-alone A(k)-index as well.

7. EXPERIMENTS

In this section, we present our experimental study comparing our algorithms with previous methods. All algorithms are implemented in Java. The machine used for experiments is a Dell PowerEdge 2600 with a 2.4GHz Xeon processor and 1GB of RAM, running Linux with JDK. Our machine has enough memory to store everything and no paging is needed during execution. We use the same performance metrics as previous works [8, 17], i.e., we measure efficacy in terms of the quality of the index as defined in Section 3, and efficiency in terms of the wall-clock running time.

We use both benchmark and real-life XML databases in our experiments. The XMark database is generated by the XMark generator from the XML Benchmark Project [2]. It is a highly cyclic and irregular database likely to stress the use of structural indexes. It is 11.7MB in size and consists of 167,865 dnodes and 198,612 dedges, among which 30,747 are IDREF edges.
A sample of this database is shown in Figure 1. Cycles in this database are caused by a large number of person-auction edges. To see how our algorithms handle data graphs with cycles, we intentionally remove a portion of those edges to vary the cyclicity, which we define to be the fraction of such edges remaining. We name these data sets XMark(c) where c is the cyclicity; e.g., XMark(1) is the original XMark database, and XMark(0) contains no person-auction edges and thus no cycles, although they have the same number of dnodes.

The real-life dataset is extracted from the Internet Movie Database (IMDB) [1] in the following way: First we randomly choose a small subset of movies and all people (actors, directors, etc.) associated with these movies. We then extract all other movies associated with these people, and continue this process until the desired database size is reached. For each movie or person, we also extract a substantial amount of other information (e.g., title, year, genre). This dataset consists of 272,567 dnodes and 285,221 dedges, among which 12,654 are IDREF edges. Overall, it is also a highly cyclic and irregular dataset.

7.1 Experiments on the 1-Index

Edge insertions and deletions. For the 1-index edge insertions and deletions, we compare with the propagate algorithm [8]. In this set of experiments we apply a mixed sequence of edge insertions and deletions on both the XMark and IMDB data. For XMark, we select four datasets with cyclicities 1, 0.5, 0.2 and 0, to see how the algorithms perform, since as suggested by Theorem 1, the performance might be affected by cycles.

In order to generate edge insertions in a meaningful way, we first remove 20% of all the IDREF edges from the data graph. These deleted edges then become a pool of possible insertions. Using the resulting data graph as the starting point, we perform one edge insertion followed by one edge deletion in each step: first a randomly selected edge is removed from the pool and inserted into the data graph, and then another randomly selected edge is deleted from the data graph and put back into the pool. For each dataset, 5000 pairs of edge insertions and deletions are performed.

Since the propagate algorithm does not have any guarantee on the quality of the updated index, the index gets progressively worse over time and it is necessary to reconstruct the index periodically. We used the index reconstruction idea of [8], i.e., run the construction algorithm on top of the index graph (treating it as a data graph), and then blow up each inode of the new index by replacing each inode of the old index with its extent of dnodes. Since we do not know how big the current minimum index is during the course of a sequence of updates, we use the following simple heuristic to trigger index reconstructions: remember the size of the index when it was last reconstructed, and then perform a reconstruction whenever the current index is more than 5% larger than that. Since our split/merge algorithm does not guarantee the minimum 1-index on cyclic data graphs either, we use the same heuristic to trigger reconstruction.

[Figure 9: 1-index quality over mixed edge insertions and deletions on IMDB. Curves: split/merge, propagate, propagate + reconstruction. Axes: index quality versus # (insertion, deletion) pairs.]

Results for IMDB are shown in Figure 9. Performance of propagate for the first 500 edge updates agrees with the previously reported results [8] very well: around 5% increase in index size. In fact the result reported in [8] was a little better than this, which can be explained by the fact that [8] only did edge insertions, while edge deletions are a little more difficult to handle because the minimum 1-index itself usually shrinks when edges are deleted. After that, we see that its index quality continues to degrade almost linearly with the number of edge updates performed. Thus, a reconstruction is triggered about once every 500 updates.
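The reconstruction heuristic described above is simple enough to state as code. The sketch below is only an illustration of that rule: the class and method names are hypothetical, index size is assumed to be an inode count, and integer arithmetic is used so that the 5% comparison is exact.

```python
class ReconstructionTrigger:
    """Fire when the index has grown more than `threshold_pct` percent
    beyond its size at the last reconstruction."""

    def __init__(self, initial_size, threshold_pct=5):
        self.baseline = initial_size  # size right after the last rebuild
        self.threshold_pct = threshold_pct

    def should_reconstruct(self, current_size):
        # current > baseline * (1 + pct / 100), written without floats.
        return current_size * 100 > self.baseline * (100 + self.threshold_pct)

    def record_reconstruction(self, new_size):
        self.baseline = new_size


# Hypothetical index sizes observed after successive updates.
trigger = ReconstructionTrigger(initial_size=1000)
rebuilds = 0
for size in [1010, 1030, 1050, 1051, 1060]:
    if trigger.should_reconstruct(size):
        rebuilds += 1
        trigger.record_reconstruction(1000)  # pretend the rebuild restores 1000 inodes
print(rebuilds)
```

Because the baseline is the size at the last reconstruction rather than the unknown current minimum, the rule can both fire late (if the minimum shrank after deletions) and fire early (if the minimum itself grew), which is why the text calls it a heuristic.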
On the other hand, our split/merge algorithm maintains the index quality very well, never exceeding 3%. This experiment shows that the minimal 1-index maintained by our algorithm is in fact very close to minimum for this dataset.

Results for XMark are shown in Figure 10. An interesting fact is that on these datasets, our split/merge algorithm performs extremely well: its quality curves virtually remain zero (never exceeding 0.5%). The reason is that the IDREF edges in the XMark datasets are generated more uniformly, while in IMDB they tend to be clustered: related persons are likely to get involved in related movies, creating shorter cycles that make cases similar to Figure 4 more likely than in XMark.

For propagate, we see similar trends for all datasets: its quality curves almost always grow linearly, although the rate varies a lot for different cyclicities: on XMark(1), the index quality is still better than 12% after the edge updates, but on XMark(0) it gets worse very quickly. The reason is that XMark(1) is a highly irregular dataset; even the size of its minimum 1-index is more than 40% of its data graph size. For such a big index, there are very few possible merges during updates, so the propagate algorithm performs relatively well. However, such large 1-indexes usually lead to bad query performance, and we usually turn to other smaller indexes, such as A(k), for these cases. As the cyclicity decreases, the data graph also gets more regular, and the minimum 1-index shrinks. The propagate algorithm then has increasing difficulty in keeping the index fit, and has to perform more frequent reconstructions.

We also measured the average running times over the edge updates for each dataset. From Figure 11 we can see that the split/merge algorithm is more costly than the propagate algorithm, due to the extra merge phase, but it becomes much faster if we factor in the amortized reconstruction cost (total reconstruction cost divided by 10000). Notice that cyclicity does not seem to affect the performance of the split/merge algorithm, showing that cases like Figure 5 are not common. Finally, note that the index is essentially unusable during the reconstruction, while our split/merge algorithm always responds quickly, thereby making the index more available for queries.

Subgraph additions.
We also conduct experiments on subgraph additions with the XMark data. We extract subgraphs in the following manner. First we randomly select an auction dnode u, and then perform a downward traversal starting from u to extract all descendants of u, which form a subgraph. We do not traverse IDREF edges because we want to avoid cycles, and also because the IDREF edges usually represent inter-object relationships that are not integral parts of the entity of interest. In this way we extract 500 subgraphs, with an average size of 50 dnodes. For each dataset, we first delete all these subgraphs, and then insert them one by one.

We compare three alternatives: (1) our algorithm of Section 5.2, (2) the same algorithm but using propagate instead of insert_1_index_edge to insert the edges, and (3) the index reconstruction algorithm of [8], which always maintains the minimum 1-index but is extremely costly. We obtain almost the same results again: Our algorithm keeps the quality of the 1-index at 0% almost all the time, while the second alternative keeps increasing the index size and is very sensitive to the structure of the data graph (Figure 12). In terms of running cost, the first two alternatives are both very fast, about 20 msec for each subgraph; the third one is more than 100 times slower because of the costly reconstruction.

[Figure 10: 1-index quality over mixed edge insertions and deletions on XMark. Panels: (a) XMark(1); (b) XMark(0.5); (c) XMark(0.2); (d) XMark(0). Curves: split/merge, propagate, propagate + reconstruction. Axes: index quality versus # (insertion, deletion) pairs.]

[Figure 11: Running times of 1-index algorithms. Bars: split/merge, amortized reconstruction, propagate; running time per update (msec) for XMark(1), XMark(0.5), XMark(0.2), XMark(0) and IMDB.]

[Figure 12: 1-Index quality during a sequence of subgraph additions with the propagate algorithm. Curves: XMark(1), XMark(0.5), XMark(0.2), XMark(0). Axes: index quality versus # subgraphs added.]

7.2 Experiments on the A(k)-Index

Since we have a theoretical guarantee that our split/merge algorithm always maintains the minimum A(k)-index, the experiments on the A(k)-index are mainly aimed at efficiency issues, namely the running cost and the additional storage overhead resulting from maintaining all the A(i)-indexes for $0 \le i \le k$. In the experiments, we varied k from 2 to 5, covering the range of k's that give the best performance as reported by [9]. The cyclicities of the datasets are not tweaked here because the performance of our algorithms is not affected by cycles.

We only present the experiments on edge insertions and deletions for the A(k)-index in this paper. We compare with the following simple algorithm, obtained by fixing a minor mistake in the one mentioned at the end of [17]. After a dedge (u, v) is inserted or deleted, we do a breadth-first search to find all the potentially affected dnodes in the data graph. These dnodes are descendants of v up to a maximum depth of k−1. The corresponding inodes containing these dnodes are possibly unstable and need to be partitioned into new inodes according to k-bisimilarity. Since the A(k)-index does not retain enough information to compute k-bisimilarity, we have to go back to the data graph and compute it by definition. Notice that the cost of this simple algorithm is exponential in k. Since this algorithm does not provide any guarantee on the index quality, we also consider the option of periodic index reconstructions in the experiments, as we did with the 1-index.

For the experiments, we only perform 1000 pairs of insertions and deletions since this is already enough to see a clear trend. The simple algorithm, as expected, blows up the index size rapidly without reconstructions, especially for small k's. The results on the XMark database are shown in Figure 13.
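The first step of the simple baseline, the bounded breadth-first search from v, can be sketched as follows. This is an illustrative Python sketch under assumptions, not the code used in the experiments: the function name and edge-list representation are hypothetical, v itself is counted as a descendant at depth 0, and the repartitioning by k-bisimilarity that follows the search is not shown.

```python
from collections import defaultdict, deque


def potentially_affected_dnodes(edges, v, k):
    """Dnodes reachable from v by following at most k - 1 dedges (v included)."""
    if k < 1:
        return set()  # the A(0)-index depends on labels only
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    depth = {v: 0}
    queue = deque([v])
    while queue:
        node = queue.popleft()
        if depth[node] == k - 1:
            continue  # do not expand beyond the depth limit
        for child in children[node]:
            if child not in depth:  # also guards against cycles
                depth[child] = depth[node] + 1
                queue.append(child)
    return set(depth)


# Hypothetical data graph containing the cycle 1 -> 2 -> 3 -> 1.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (3, 1)]
print(sorted(potentially_affected_dnodes(edges, 1, 2)))
print(sorted(potentially_affected_dnodes(edges, 1, 3)))
```

The search itself is linear in the part of the graph it visits; the expensive part of the baseline is recomputing k-bisimilarity on the data graph for the dnodes it returns.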
The result on IMDB is similar and omitted. When the reconstruction threshold is set to 5%, this simple algorithm triggers frequent reconstructions, as shown in Table 1. Running times of our split/merge algorithm and this sim-


More information
