Similarity Queries: Their Conceptual Evaluation, Transformations, and Processing


Yasin N. Silva · Walid G. Aref · Per-Åke Larson · Spencer S. Pearson · Mohamed H. Ali

Abstract Many application scenarios can significantly benefit from the identification and processing of similarities in the data. Even though some work has been done to extend the semantics of some operators, e.g., join and selection, to be aware of data similarities, there has not been much study on the role and implementation of similarity-aware operations as first-class database operators. Furthermore, very little work has addressed the problem of evaluating and optimizing queries that combine several similarity operations. The focus of this paper is the study of similarity queries that contain one or multiple first-class similarity database operators, e.g., Similarity Selection, Similarity Join, and Similarity Group-by. Particularly, we analyze the implementation techniques of several similarity operators; introduce a consistent and comprehensive conceptual evaluation model for similarity queries; and present a rich set of transformation rules to extend cost-based query optimization to the case of similarity queries.

Keywords similarity queries · query processing · query transformations · conceptual evaluation

Y. N. Silva, Arizona State University, Arizona, USA. E-mail: ysilva@asu.edu
W. G. Aref, Purdue University, Indiana, USA. E-mail: aref@cs.purdue.edu
P.-Å. Larson, Microsoft Research, Redmond, USA. E-mail: palarson@microsoft.com
S. S. Pearson, Arizona State University, Arizona, USA. E-mail: sspearso@asu.edu
M. H. Ali, Microsoft Corporation, Redmond, USA. E-mail: mali@microsoft.com

1 Introduction

It is widely recognized that the move from exact semantics of data and queries to imprecise and approximate semantics is one of the key paradigm shifts in data management. Many application scenarios, e.g., marketing analysis, sensor networks, and biological applications, can greatly benefit from the identification and processing of similarities in data.
Some techniques have been proposed to extend certain data operations, e.g., join and selection, to make use of data similarities. However, there has not been much work on the study of similarity-aware operations as physical database operators. Furthermore, there is very little work on the important problem of evaluating and optimizing queries with multiple similarity operations, i.e., similarity queries. Similarity queries enable answering more complex and interesting questions like the following (business scenario):

- Find the closest three suppliers for every customer within 100 miles from our Chicago headquarters.
- Considering the customers that are located within 200 miles from our Chicago headquarters, cluster the customers around certain locations of interest and report the size of each cluster.
- For every customer, identify its closest 3 suppliers and for each such supplier, identify its closest 2 potential new suppliers.

The focus of this paper is the study of similarity queries with one or multiple physical similarity database operators. We describe several similarity operators and introduce a comprehensive conceptual evaluation model for similarity queries. Moreover, we present a rich set of transformation rules that enable cost-based query optimization of similarity queries.

This paper builds on two other papers [1,2]. The work in these previous papers focuses mainly on the independent study of two similarity database operators: Similarity Group-by (SGB) [1] and Similarity Join (SJ) [2]. These operators were also presented in two demonstration papers [3,4]. In this paper, we consider the fundamental problems of the evaluation and optimization of similarity queries with multiple similarity operators. The main contributions of this paper are:

- We consolidate work on previously proposed first-class similarity database operators. We present the Similarity Group-by and the Similarity Join operators (Section 3.1), their generic definitions and multiple instances. We present the guidelines to implement these operators (Section 5) and the results of their performance and scalability evaluation.
- We introduce a comprehensive conceptual evaluation order for similarity queries with multiple similarity operators (Section 3), i.e., Similarity Group-by, Similarity Join, and Similarity Selection. This evaluation order specifies a clear and consistent way to execute a similarity query. It also specifies unambiguously what the result of a similarity query is, even in the presence of various similarity operators.
- We present a rich set of equivalence rules to transform query plans with multiple similarity operators (Section 4). The presented rules can be used to transform the conceptual evaluation plan into more efficient equivalent plans. The presented rules include: (i) rules to combine and separate multiple similarity predicates (Section 4.1); (ii) core equivalence rules, e.g., commutativity, distributivity, and associativity of similarity operators (Section 4.2); and (iii) rules that exploit interesting properties of distance functions to generate more efficient plans (Section 4.3).
- We identify several key general transformation guidelines for similarity query optimization and show how multiple transformation rules can be applied to transform complex similarity queries (Section 4.5).
- We evaluate experimentally the effectiveness of several proposed transformation rules and show that they can generate plans with execution times that are only 10% to 70% of those of the initial query plans (Section 6).

While the examples presented in this paper consider the case of numeric and vector data, unless otherwise stated, the definition of similarity operators, the conceptual evaluation model, and the equivalence rules presented in the paper are applicable to any data type and distance function. The new material is not only more than 50% of this paper but also the focus of it.

The rest of the paper is organized as follows. Section 2 describes related work. Section 3 introduces the conceptual evaluation order for similarity queries. Section 4 presents transformation rules for similarity queries. Section 5 presents the implementation guidelines of similarity operators. The performance evaluation of the implemented operators and the evaluation of the effectiveness of transformation rules are studied in Section 6. Section 7 presents the conclusions and future research directions.

2 Related Work

Clustering, one of the oldest similarity-aware operations, has been studied extensively, e.g., in pattern recognition, biology, statistics, and data mining. Of special interest is the work on clustering of large datasets. CURE [5] and BIRCH [6] are two clustering algorithms based on sampling and summaries, respectively. They use only one pass over the data and hence notably reduce the execution time of clustering. However, their execution times are still significantly longer than that of the standard group-by. The main differences between these operations and the Similarity Group-by operators we present are: (i) the execution times of the Similarity Group-by operators are very close to that of the regular group-by; (ii) Similarity Group-by operators are fully integrated with the query engine, allowing the direct use of their results in complex query pipelines for further analysis; and (iii) the computation of aggregation functions in Similarity Group-by is integrated in the grouping process and considers all the tuples in each group, not a summary or a subset based on sampling.
Several clustering algorithms have been implemented in data mining systems. In general, the use of clustering is via a complex data mining model and the implementation is not integrated with the standard query processing engine. The work by Zhang and Huang [7] proposes some SQL constructs to make clustering facilities available from SQL in the context of spatial data. Basically, these constructs act as wrappers of conventional clustering algorithms but no further integration with database systems is studied. Li et al. [8] extend the group-by operator to approximately cluster the tuples in a pre-defined number of clusters. Their framework makes use of conventional clustering algorithms, e.g., K-means; and employs summaries and bitmap indices to integrate clustering and ranking into database systems. Our study differs from the work by Li et al. in that (i) we focus on similarity grouping operators without the tight coupling to ranking; (ii) our framework does not depend on costly conventional clustering algorithms, but rather allows the specification of the desired grouping using descriptive properties such

as group size and compactness; and (iii) we consider optimization techniques for queries that combine Similarity Group-by and other operators.

Previous work on data reconciliation proposed SQL extensions to support user-defined similarity functions for grouping purposes [9] and similarity grouping predicates [10]. This previous work focuses on string similarity and similarity predicates to reconcile records. Although Similarity Group-by operators can be used for this purpose, they are more general and are fully integrated into the query engine.

Significant work has been carried out on the extension of certain common operations, i.e., Join and Selection, to make use of similarities in the data. This work introduced the semantics of the extended operations and proposed techniques to implement them primarily as standalone operations outside of a Database Management System (DBMS) query engine rather than as physical database operators. Several types of Similarity Join have been proposed in the literature, e.g., range distance join (retrieves all pairs whose distances are smaller than a pre-defined threshold ε) [11], k-distance join (retrieves the k most-similar pairs) [12], and kNN-join (retrieves, for each tuple in one table, the k nearest neighbors in the other table) [13].

Also of importance is the work on Similarity Join techniques that make use of relational database technology [14, 15]. These techniques are applicable only to string or set-based data. The general approach pre-processes the data and query, e.g., decomposes data and query strings into sets of grams (substrings of a string that are used as its signature), and stores the results of this stage in separate relational tables. Then, the result of the Similarity Join can be obtained using standard SQL statements. A key difference between this work and ours is that we focus on studying the properties, optimization techniques, e.g., query transformation rules, and implementation techniques of several types of Similarity Join as database operators themselves rather than studying the way a Similarity Join can be answered using standard operators.
Similarity Selection operations can be seen as special cases of Similarity Joins with single-tuple inner relations. Among recent contributions on Similarity Selection are the study of fast indices and algorithms for set-based Similarity Selection using semantic properties for search space pruning [16], a quantitative cost-based approach to build high-quality grams to support selection queries on strings [17], and dimensionality reduction techniques to support similarity search using the Earth Mover's Distance [18].

The work by Adali et al. [19] proposes an algebra for similarity queries and presents extensions of simple algebra rules to the case of similarity operators. A framework for similarity query optimization using simple equivalence rules is presented by Ferreira et al. [20]. These two papers do not consider Similarity Group-by or all the types of Similarity Join we consider. Traina et al. [21] propose an extension to the relational algebra to support similarity predicates combined using Boolean operators. This work, however, does not consider Similarity Join, Similarity Group-by, and queries that combine non-similarity and similarity predicates. Barioni et al. [22] propose SQL syntax to express queries that use both non-similarity and similarity predicates. Baioco et al. [23] present a cost model to estimate the number of I/O accesses and distance calculations to answer similarity queries over data indexed using metric access methods. These two papers only consider ε-join and kNN-joins. The main difference between the work in [19-22] and our work is that we present a comprehensive model to evaluate queries with multiple similarity operators (Similarity Group-by, Similarity Join, and Similarity Selection), and a rich set of transformation rules for queries with multiple non-similarity and similarity operators.

3 Conceptual Evaluation of Similarity Queries

Many real-world scenarios can benefit from the support of queries with multiple similarity operators. One of the core elements to support generic similarity queries is a conceptual evaluation order that clearly specifies the expected results of a given query.
The conceptual evaluation order presented in this section specifies a clear and consistent way to evaluate queries with multiple similarity operators.

3.1 Supported Similarity-aware Operators

The Similarity Group-by Operator (SGB)

Similarity Group-by is a physical database operator that extends the standard group-by to allow the formation of groups based on similarity rather than equality of the data. SGB is a practical similarity grouping operator that can be combined with other operators to efficiently answer similarity queries needed in real-world applications.

Generic Definition of Similarity Group-by

We define the Similarity Group-by operator as follows:

${}_{(G_1,S_1),\ldots,(G_n,S_n)}\gamma_{F_1(A_1),\ldots,F_m(A_m)}(E)$,

where $E$ is a relation, $G_i$ is an attribute of $E$ used to generate the groups, i.e., similarity grouping attribute,

$S_i$ is a segmentation of the domain of $G_i$ in non-overlapping segments, $F_i$ is an aggregation function, and $A_i$ is an attribute of $E$. Similar to group-by, each tuple that belongs to the result of SGB represents one group.

Fig. 1 Types of Similarity Group-by. [Figure omitted; the panels show the groups formed over a 1D set of temperature values by the following queries.]
(a) Groups 1 to 5: SELECT Max(Temperature), Avg(Temperature) FROM SensorsReadings GROUP BY Temperature MAXIMUM_ELEMENT_SEPARATION 6 MAXIMUM_GROUP_DIAMETER 20
(b) Groups 1 and 2: SELECT Max(Temperature), Avg(Temperature) FROM SensorsReadings GROUP BY Temperature AROUND {10,60} MAXIMUM_ELEMENT_SEPARATION 6 MAXIMUM_GROUP_DIAMETER 20
(c) Groups 1 to 5: SELECT Max(Temperature), Avg(Temperature) FROM SensorsReadings GROUP BY Temperature DELIMITED BY (SELECT Value FROM Thresholds)

We present three implementable instances of the generic SGB. They represent a middle ground between the regular group-by and standard clustering algorithms. The SGB instances are intended to be faster than regular clustering algorithms. These instances generate groups that capture similarities in the data not identified by group-by.

Unsupervised Similarity Group-By (SGB-U)

This operator groups a set of tuples in an unsupervised fashion, i.e., with no extra data tuples provided to guide the process. SGB-U is defined only over 1D numeric data and uses two clauses (group compactness and group size constraints) to form the groups:

1. MAXIMUM_ELEMENT_SEPARATION s: The distance between each pair of adjacent elements that belong to the same group should be at most s.
2. MAXIMUM_GROUP_DIAMETER d: For each group, the distance between the two most separated elements in the group should be at most d.

The clauses can be combined using the AND operator. Group formation starts from the tuple with the lowest grouping attribute value. Fig. 1.a gives an example of using SGB-U with s = 6 and d = 20. Group 1 is composed of the records with values 1 and 5. While this group could also contain values 16 and 20 based on d, they form part of the second group because the distance between 5 and 16 is greater than s.

Supervised Similarity Group Around (SGB-A)

SGB-A is defined over data in a Euclidean space.
This operator groups tuples based on a set of guiding points, named central points, such that each tuple is assigned to the group of its closest central point. Also, the size and compactness of the groups can be restricted by:

1. MAXIMUM_ELEMENT_SEPARATION s: For each element e of a group, it is possible to build a path from e to the group's central point where the length of every link is at most s.
2. MAXIMUM_GROUP_DIAMETER 2r: The distance from each element to its central point is at most r. r represents the maximum radius.

The central points can be specified using a list of points or by another select statement. If a tuple is equidistant from multiple central points, the tuple is assigned to the group of the central point with the lowest lexicographical order. SGB-A generates at most as many groups as central points are provided, and all the elements that do not belong to any group are not considered in the output. Fig. 1.b gives an example of SGB-A with s = 6, r = 10 and central points 10 and 60. Group 1 is composed of values 1, 5, 16 and 20. While this group could contain value 24 based on s, this value does not belong to the group because the distance between 24 and the group's central point (10) is greater than r.

Supervised SGB with Delimiters (SGB-D)

SGB-D is defined over data in a Euclidean space. SGB-D forms groups based on a set of delimiting objects (hyperplanes: points in 1D, lines in 2D, etc.). To ensure a deterministic behavior, if a tuple lies on a delimiting hyperplane specified by $a_1x_1 + a_2x_2 + \ldots + a_nx_n = b$, the tuple belongs to the group that contains points in the region $a_1x_1 + a_2x_2 + \ldots + a_nx_n < b$. Fig. 1.c gives an example of SGB-D with delimiting points 10, 25, 45 and 60. Group 1 contains values 1 and 5.

An important property of all the presented operators is that multiple executions of the operators on the same data set and same reference objects, e.g., central points and delimiting objects, will generate the same results. In general, a query can have multiple similarity grouping attributes (SGAs) and the segmentation of each SGA can use a different similarity grouping instance.
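The three SGB instances described above can be sketched for 1D data as follows. This is a minimal illustration, not the operators' actual implementation (the implementation guidelines are the subject of Section 5): the function names `sgb_u`, `sgb_a`, and `sgb_d`, the greedy left-to-right scan, and the `readings` values are assumptions made for the example. The values 1, 5, 16, 20 and 24 come from the discussion of Fig. 1, and the remaining ones are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def sgb_u(values, s, d):
    """SGB-U: scan values in ascending order, starting from the lowest one."""
    groups = []
    for v in sorted(values):
        current = groups[-1] if groups else None
        # Stay in the current group only if both constraints still hold.
        if current is not None and v - current[-1] <= s and v - current[0] <= d:
            current.append(v)
        else:
            groups.append([v])
    return groups


def sgb_a(values, centers, s, r):
    """SGB-A: assign each value to its closest central point, then apply r and s."""
    ordered_centers = sorted(centers)
    assigned = defaultdict(list)
    for v in values:
        # min() keeps the first minimum, so ties go to the lowest central point.
        c = min(ordered_centers, key=lambda p: abs(v - p))
        if abs(v - c) <= r:  # MAXIMUM_GROUP_DIAMETER 2r
            assigned[c].append(v)
    groups = {}
    for c, members in assigned.items():
        left = sorted((m for m in members if m < c), reverse=True)
        right = sorted(m for m in members if m >= c)
        kept = []
        for side in (left, right):
            prev = c
            for m in side:  # walk outward from the central point
                if abs(m - prev) > s:  # MAXIMUM_ELEMENT_SEPARATION s
                    break
                kept.append(m)
                prev = m
        if kept:
            groups[c] = sorted(kept)
    return groups


def sgb_d(values, delimiters):
    """SGB-D: a value lying on a delimiter b falls in the region below b."""
    cuts = sorted(delimiters)
    groups = defaultdict(list)
    for v in sorted(values):
        groups[bisect_left(cuts, v)].append(v)
    return dict(groups)


readings = [1, 5, 16, 20, 24, 50, 55, 62, 70]  # partly hypothetical temperatures
print(sgb_u(readings, s=6, d=20))
print(sgb_a(readings, centers=[10, 60], s=6, r=10))
print(sgb_d(readings, delimiters=[10, 25, 45, 60]))
```

With these inputs, the first SGB-U group is [1, 5], the SGB-A group around 10 is [1, 5, 16, 20] (24 is dropped because it is farther than r from the central point), and the first SGB-D group is [1, 5], which matches the cases discussed for Fig. 1.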
When a query has multiple SGAs, the result of SGB is obtained by intersecting the segmentations of all the (independent) SGAs. The following example applies SGB-A on attribute Pressure and SGB-D on attribute Temperature.

    SELECT Avg(Temperature), Avg(Pressure)
    FROM SensorsReadings
    GROUP BY Pressure AROUND {30,50} MAXIMUM_ELEMENT_SEPARATION 3,
             Temperature DELIMITED BY (SELECT Val FROM Thresholds);

The Similarity Join Operator (SJ)

Similarity Joins extend regular joins to identify tuples of similar rather than equal values. SJs have been studied as key operations in multiple domains. However,

there has not been much study on the role and implementation of SJs as physical database operators. In this section, we focus on the study of Similarity Joins as first-class database operators.

Generic Definition and Four Instances of Similarity Join

The generic definition of the Similarity Join (SJ) operator is as follows:

$E \bowtie_{\theta_S(e,f)} F = \{\langle e, f\rangle \mid \theta_S(e, f),\ e \in E,\ f \in F\}$,

where $\theta_S$ represents the Similarity Join predicate, i.e., the similarity-based conditions that the pairs ⟨e, f⟩ need to satisfy to be in the output. The SJ types we consider are presented next. Corresponding SQL syntax and examples with numerical data are presented in Fig. 2.

1. Range Distance Join (ε-join): $\theta_\varepsilon(e, f) \equiv dist(e, f) \le \varepsilon$. In the example in Fig. 2.a, ⟨4, 9⟩ is one of the five pairs that belong to the output (dist(4, 9) ≤ 5).
2. k Nearest Neighbor Join (kNN-join): $\theta_{kNN}(e, f) \equiv$ f is one of the k nearest neighbors of e. If a tuple in E has fewer than k neighbors in F, the output should include pairs for all existing neighbors. Let $t_E$ be a tuple of E and $t_F$ one of the kNN of $t_E$ in F. If there are other tuples in F with the same distance from $t_E$, the output should include pairs for all such tuples. In Fig. 2.b, values 9 and 17 are the two (k=2) nearest neighbors of value 4, thus ⟨4, 9⟩ and ⟨4, 17⟩ are in the output. Similarly, 10, 22 and 42 each have two nearest neighbors.
3. k Distance Join (kD-join): $\theta_{kD}(e, f) \equiv$ ⟨e, f⟩ is one of the overall k-closest pairs. If the total number of possible pairs is less than k, the output should include all the existing pairs. If there are multiple pairs separated by the same distance and one of them is included in the output, then all such pairs need to be part of the output. In Fig. 2.c, ⟨10, 9⟩ and ⟨22, 24⟩ are the overall two closest pairs.
4. Join Around (Join-Around): $\theta_{A,MD=2r}(e, f) \equiv$ f is the closest neighbor of e and $dist(e, f) \le r$. Let $t_E$ be a tuple of E and $t_F$ the closest neighbor of $t_E$ in F. If there are other tuples in F with the same distance from $t_E$, the output should include pairs for all such tuples. In Fig. 2.d, ⟨10, 9⟩ is one of the three pairs that belong to the output (9 is the closest neighbor of 10 in B and dist(10, 9) ≤ 3).
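The four SJ predicates listed above can be written as short nested-loop sketches over 1D values. This is only an illustration of the semantics, including the tie-handling requirements, and not an efficient or authoritative implementation. The function names are assumptions, values are assumed to be distinct so that they can stand in for primary keys, and the contents of `B` are hypothetical, chosen to be consistent with the pairs mentioned for Fig. 2.

```python
from itertools import product


def dist(a, b):
    return abs(a - b)


def eps_join(E, F, eps):
    """Range distance join: all pairs within eps of each other."""
    return [(e, f) for e, f in product(E, F) if dist(e, f) <= eps]


def knn_join(E, F, k):
    """kNN-join: for each e, its k nearest f values plus ties at the k-th distance."""
    out = []
    for e in E:
        ranked = sorted(F, key=lambda f: dist(e, f))
        if not ranked:
            continue
        cutoff = dist(e, ranked[min(k, len(ranked)) - 1])
        out += [(e, f) for f in ranked if dist(e, f) <= cutoff]
    return out


def kd_join(E, F, k):
    """kD-join: the overall k closest pairs plus ties at the k-th distance."""
    pairs = sorted(product(E, F), key=lambda p: dist(*p))
    if not pairs:
        return []
    cutoff = dist(*pairs[min(k, len(pairs)) - 1])
    return [p for p in pairs if dist(*p) <= cutoff]


def join_around(E, F, r):
    """Join-Around: each e with its closest f (and ties), only if within r."""
    out = []
    for e in E:
        if not F:
            continue
        best = min(dist(e, f) for f in F)
        if best <= r:
            out += [(e, f) for f in F if dist(e, f) == best]
    return out


A = [4, 10, 22, 42]
B = [9, 17, 24, 39]  # hypothetical inner values
print(eps_join(A, B, 5))     # five pairs, including (4, 9)
print(knn_join(A, B, 2))     # two pairs per value of A
print(kd_join(A, B, 2))      # [(10, 9), (22, 24)]
print(join_around(A, B, 3))  # three pairs, including (10, 9)
```

Computing the cutoff distance first and then keeping every pair at or below it is one simple way to honor the rule that all tuples tied at the boundary distance must appear in the output.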
ε-join, kNN-join, and kD-join are common types of SJ. We introduce Join-Around, a new useful type of SJ that combines some properties of ε-join and kNN-join (k=1). Every value of the first joined set is assigned to its closest value in the second set. Additionally, only the pairs separated by a distance of at most r are part of the join output. MD stands for Maximum Diameter and r = MD/2 represents the Maximum Radius.

Fig. 2 Types of Similarity Join. [Figure omitted; each panel shows the pairs produced when joining the numeric sets A and B.]
(a) ε-Join (ε=5): SELECT ... FROM A, B WHERE A.a WITHIN ε OF B.b
(b) kNN-Join (k=2): SELECT ... FROM A, B WHERE B.b k NEAREST_NEIGHBOR_OF A.a
(c) kD-Join (k=2): SELECT ... FROM A, B WHERE A.a k TOP_CLOSEST_PAIR B.b
(d) Join-Around (r=3): SELECT ... FROM A, B WHERE A.a AROUND B.b [MAX_DIAMETER 2r]

The Similarity Selection Operator (SS)

Similarity Selection operators can be seen as special cases of the SJ operators where the inner input relation consists of a single tuple. The range distance selection operator is a special case of the range distance join and the kNN selection operator is a special case of the kNN-Join. The generic definition of the Similarity Selection operator is as follows:

$\sigma_{\theta_S(e)}(E) = \{e \mid \theta_S(e),\ e \in E\}$,

where $\theta_S$ represents the Similarity Selection predicate. This predicate specifies the similarity-based conditions that tuple e needs to satisfy to be in the Similarity Selection output. The Similarity Selection predicates for the Similarity Selection operators considered in our study are as follows. Let C be a constant value.

1. Range Distance Selection (ε-Selection): $\theta_{\varepsilon,C}(e) \equiv dist(e, C) \le \varepsilon$.
2. kNN-Selection: $\theta_{kNN,C}(e) \equiv$ e is a k-closest neighbor of C. If C has fewer than k neighbors in E, the output should include all the existing neighbors. If there are multiple tuples equidistant from C and one of them is included in the output, then all such tuples need to be part of the output.

We require that all the relations involved in the k-based operations, i.e., kNN-join, kD-join, Join-Around and kNN-Selection, have a primary key (PK). This allows the correct computation of the results when the relations have duplicates or have been combined with other relations, using only the values of the attributes involved in the operations' predicates (and the required PKs).
3.2 Notation Used in Similarity-aware Expressions

Unless otherwise stated, the expressions in Sections 3 and 4 use the following notation:

1. Relations are represented with uppercase letters, e.g., E, F, and G. The attributes of these relations are represented using the corresponding lowercase letters, e.g., e, f, and g. When an expression requires multiple attributes of a given relation (E), we use a number next to the base name, e.g., e1, e2, etc.
2. Similarity and regular (non-similarity) join predicates are specified using the expression $\theta_S(e, f)$. e and f are the outer and inner join attributes, respectively. When an expression is applicable to multiple types of joins, the value of S is a general variable, e.g., $S$, $S_1$, or $S_2$. If an expression is applicable to a particular type of Similarity Join, the value of S can be: ε (ε-join), kNN (kNN-join), A (Join-Around) or kD (kD-join). Regular join uses a similar notation without the S component. For example, the predicate $\theta_\varepsilon(e, f)$ represents an ε-join between relations E (outer) and F (inner). E.e and F.f are the outer and inner join attributes, respectively.
3. Similarity and regular selection predicates are specified using the expression $\theta_{S,C}(e)$. e is the selection attribute and C refers to the constant parameter in the case of SS. When an expression is applicable to multiple types of selection, the value of S is a general variable, e.g., $S$, $S_1$, or $S_2$. If an expression is applicable to a particular type of Similarity Selection, the value of S can be: ε (ε-Selection) or kNN (kNN-Selection). Regular selection predicates use the same notation without S and C. For example, the predicate $\theta_{\varepsilon_1,C}(e)$ represents an ε-Selection operation that selects the tuples where the value of attribute E.e is within ε1 of the constant C.
4. Some generic rules have predicates that are applicable to both Similarity Selection and Similarity Join operations. In this case, we use the notation $\theta_S$, which can be instantiated as $\theta_{S,C}(e)$ or $\theta_S(e, f)$. Any constraints on the operation attributes are directly specified on the rules using this notation.
5. As in regular relational algebra (RA), a (similarity) join predicate can be used with the selection or join operators in similarity expressions. In regular RA: $\sigma_{\theta(e,f)}(E \times F) \equiv E \bowtie_{\theta(e,f)} F$. Likewise, in similarity-aware RA: $\sigma_{\theta_S(e,f)}(E \times F) \equiv E \bowtie_{\theta_S(e,f)} F$.
We use similarity join predicates with selection operators in rules that focus on the combination of multiple operations, e.g., SS and SJ. The notation using a join operator is used in all other cases.
6. We say that the attributes of an expression have a single direction when the expression is composed by join predicates and their attribute graph is of the form $a_1 \rightarrow a_2 \rightarrow \ldots \rightarrow a_n$, e.g., $e \rightarrow f \rightarrow g$. The attribute graph is built as follows. The vertices of the graph are the join attributes and each join is represented as a directed edge from the outer attribute (left attribute of the join predicate) to the inner one (right attribute of the join predicate).

Fig. 3 Different ways to combine ε-Selection and kNN-Selection. [Figure omitted; panels: plan with both predicates combined (kNN=4); evaluating kNN-Selection first; evaluating ε-Selection first.]

Fig. 4 Conceptual evaluation order of similarity queries. [Figure omitted; it shows the query template below and its conceptual plan: the cross product of E1,…,En at the bottom, the combined selection and join predicates, SGB over RegGA1,…,RegGAx and SimGExp1,…,SimGExpy, and TOP k.]

    SELECT [TOP k WITH TIES] ListOfAttributes
    FROM E1,…,En
    WHERE RegSelPred1 AND … AND RegSelPredp
      AND EpsSelPred1 AND … AND EpsSelPredq
      AND kNNSelPred1 AND … AND kNNSelPredr
      AND RegJoinPred1 AND … AND RegJoinPreds
      AND EpsJoinPred1 AND … AND EpsJoinPredt
      AND kNNJoinPred1 AND … AND kNNJoinPredu
      AND JoinArdPred1 AND … AND JoinArdPredv
      AND kDJoinPred1 AND … AND kDJoinPredw
    GROUP BY RegGA1,…,RegGAx SimGExp1,…,SimGExpy
    ORDER BY SortExpr

3.3 Conceptual Evaluation Order of Similarity Queries

In general, the order in which the operations of a similarity query are evaluated affects the results of a query. For instance, consider the left hand side (LHS) plan of Fig. 3. This plan shows a similarity query with two Similarity Selection predicates (ε-Selection and kNN-Selection). Fig. 3 illustrates two ways in which this query could be evaluated and the different results obtained under each evaluation. The middle plan in the figure corresponds to evaluating first the kNN-Selection predicate and applying the ε-Selection over the output of the first operator. The right hand side (RHS) plan corresponds to evaluating first the ε-Selection predicate and then the kNN-Selection.
It is not clear which way this query should be evaluated and, without a clear conceptual evaluation order of similarity queries, multiple users may write the same query expecting different results.
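The ambiguity can be reproduced with a small sketch. The helper names, the data, and the parameter values below are hypothetical and only meant to illustrate the point. The two predicates are assumed to use different query constants (`C1` and `C2`), since with a single shared constant both orders select the same closest tuples.

```python
def eps_select(rows, c, eps):
    """eps-Selection: keep the values within eps of the constant c."""
    return [e for e in rows if abs(e - c) <= eps]


def knn_select(rows, c, k):
    """kNN-Selection: keep the k values closest to c, plus any ties."""
    if not rows:
        return []
    dists = sorted(abs(e - c) for e in rows)
    cutoff = dists[min(k, len(dists)) - 1]
    return [e for e in rows if abs(e - c) <= cutoff]


E = list(range(1, 11))  # hypothetical 1D relation with values 1..10
EPS, C1 = 2, 3          # eps-Selection: within 2 of the constant 3
K, C2 = 3, 6            # kNN-Selection: 3 nearest neighbors of the constant 6

knn_first = eps_select(knn_select(E, C2, K), C1, EPS)
eps_first = knn_select(eps_select(E, C1, EPS), C2, K)
print(knn_first)  # [5]
print(eps_first)  # [3, 4, 5]
```

Evaluating the kNN-Selection first keeps only the neighbors of 6 that also lie near 3, whereas evaluating the ε-Selection first makes the kNN-Selection search for neighbors among the already filtered tuples, which yields a different answer.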

Fig. 5 Combining multiple similarity-aware predicates. [Figure omitted; the predicates Pred 1, …, Pred n of a generic selection node are evaluated independently and their results are intersected.]

Fig. 6 Using the conceptual evaluation order. [Figure omitted; query: SELECT e FROM E WHERE EpsSelPred(e) AND kNNSelPred(e); the conceptual evaluation plan (kNN=4) is shown next to the plan that evaluates kNN-Selection first.]

Fig. 7 Conceptual evaluation of a query with multiple similarity predicates. [Figure omitted; query: SELECT e, f, g FROM E, F, G WHERE EpsSelPred(e) AND kNNSelPred(e) AND EpsJoinPred(e,f) AND kNNJoinPred(f,g); each predicate is evaluated over the cross product of E, F, and G.]

Fig. 4 presents the conceptual evaluation order for similarity queries. The conceptual query plan makes use of a generic similarity-selection node that combines multiple SS and SJ predicates using the conventional intersection operator as shown in Fig. 5. Based on the conceptual evaluation order presented in Fig. 4, a generic similarity-aware query composed by multiple SGB, SJ and SS operators is evaluated as follows. At the bottom of the plan, all the relations involved in the query get combined using cross product. A generic Similarity Selection is evaluated after the cross product operation. This step is equivalent to intersecting the results of evaluating independently each SS and SJ predicate. The regular and similarity grouping operations are evaluated over the results of the selection node. Finally, an optional TOP operator selects the top K tuples using the order established by SortExpr.

The presented conceptual evaluation order specifies clearly the result of a similarity query even in the presence of multiple similarity-aware operators. For example, Fig. 6 shows how the query represented in the LHS plan of Fig. 3 is evaluated using the conceptual evaluation order. This figure also illustrates that the conceptual evaluation plan of this query is equivalent to evaluating first the kNN-Selection operator and applying the ε-Selection on the results of the first operator. We will study this and other equivalence rules in Section 4. Note that the query corresponding to the other order of execution, i.e., executing ε-Selection before kNN-Selection, can be specified using a subquery:

    SELECT e FROM (SELECT e FROM E WHERE EpsSelPred(e)) WHERE kNNSelPred(e);

Fig. 7 gives the conceptual evaluation plan of a query with multiple similarity predicates.

4 Similarity Query Transformations

Similar to conventional query processing, the conceptual evaluation of a similarity query is not, in many cases, an efficient way to evaluate the query. Conventional database systems often make use of equivalence rules to transform a query plan into equivalent plans that generate the same result. Cost-based query optimizers compute the cost of each plan and return the plan with the smallest cost for execution. Equivalence rules are clearly a core component of the optimization process. A fundamental question when considering queries with multiple similarity operators is how these queries can be transformed. Even though similarity operators have been extensively studied, there has not been much study on the way queries with these operators can be transformed or optimized.

This section presents a systematic study of equivalence rules for similarity queries. These rules allow the extension of cost-based optimization techniques to the case of similarity queries. The presented rules also allow the transformation of a similarity query from its conceptual evaluation plan into multiple equivalent plans. This section focuses on the presentation of general rules (GR) and the discussion of the applicability of these rules to specific similarity operators. General rules specify both equivalences and nonequivalences. An extensive list of equivalence and nonequivalence rules (ER), i.e., all general rule instances, is presented in Appendix A. This section includes examples based on an extension of the TPC-H benchmark [24]. Additional tables and attributes are described in the example queries. Some examples use location attributes (latitude/longitude).

4.1 Rules to Combine/Separate Similarity Predicates

This set of rules can be used to serialize multiple operations involved in a query. For instance, given a similarity query composed of two ε-Selection predicates applied over the same attribute, the conceptual evaluation

plan will evaluate each predicate separately. This evaluation will read and process the input relation twice and then apply an intersection operation over the intermediate results. Using the rules of this subsection we are able to obtain an equivalent plan that serializes both selection operations. The new plan only reads from the input relation once to process the first selection and performs the second one over the intermediate results. In all the rules that allow the separation, i.e., serialization, of similarity predicates we assume that the input relation is composed by the cross product of all the relations involved in the similarity predicates.

Combining/Separating Similarity Selection Predicates

Multiple SS predicates can be combined or separated using the following general rule.

GR1. $\sigma_{\theta_{S_1,C_1}(e) \cap \theta_{S_2,C_2}(e)}(E) \equiv \sigma_{\theta_{S_1,C_1}(e)}(\sigma_{\theta_{S_2,C_2}(e)}(E))$, if there is a directed edge from $S_2$ to $S_1$ in Fig. 8.

Fig. 8 Possible ways to combine and separate SS predicates. [Figure omitted; a graph with nodes ε-Selection and kNN-Selection. Legend: a doubly directed edge between P1 and P2 means that predicates of types P1 and P2 can be combined or separated under any execution order of P1 and P2; a directed edge from P1 to P2 means that they can be combined or separated only if P1 is executed before P2; a dotted edge means that they cannot be combined or separated.]

The graph in Fig. 8 concisely represents the way multiple SS predicates can be combined. A similar notation is also used in Figures 14 and 19. A doubly directed edge is a shorthand representation of two directed edges, one in each direction, between the connected nodes. Based on GR1, the doubly directed edge that starts and ends at node ε-Selection means that multiple ε-Selection predicates can be combined in any order, this is:

$\sigma_{\theta_{\varepsilon_1,C_1}(e) \cap \theta_{\varepsilon_2,C_2}(e)}(E) \equiv \sigma_{\theta_{\varepsilon_1,C_1}(e)}(\sigma_{\theta_{\varepsilon_2,C_2}(e)}(E)) \equiv \sigma_{\theta_{\varepsilon_2,C_2}(e)}(\sigma_{\theta_{\varepsilon_1,C_1}(e)}(E))$. (ER1)

Note that $\cap$ is commutative. Fig. 9 shows a graphical representation and an example of ER1. The figure shows that the LHS plan with the two combined ε-Sel. predicates is equivalent to the RHS plan where the two predicates are serialized.

Fig. 9 Combining/separating ε-Sel. and ε-Sel. (ER1). [Figure omitted.]

Also using GR1, the directed edge from kNN-Sel. to ε-Sel. states that predicates of these types can be combined or separated only when the kNN-Sel. is executed first.
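The claims above can be checked empirically with a small randomized sketch. It models the conceptual evaluation as the intersection of the independently evaluated predicates and compares it with the serialized plans. The helper names, the value ranges, and the use of distinct 1D integers (so that each value acts as its own key) are assumptions made for this illustration; such a check is not a substitute for a proof.

```python
import random


def eps_sel(rows, c, eps):
    """eps-Selection: values within eps of the constant c."""
    return [e for e in rows if abs(e - c) <= eps]


def knn_sel(rows, c, k):
    """kNN-Selection: the k values closest to c, plus ties."""
    if not rows:
        return []
    dists = sorted(abs(e - c) for e in rows)
    cutoff = dists[min(k, len(dists)) - 1]
    return [e for e in rows if abs(e - c) <= cutoff]


def conceptual(rows, *predicates):
    """Evaluate every predicate on the full input and intersect the results."""
    result = set(rows)
    for pred in predicates:
        result &= set(pred(rows))
    return sorted(result)


def rules_hold(trials=1000, seed=0):
    """Return True if no counterexample is found in the sampled inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        rows = rng.sample(range(100), 12)  # distinct values
        c1, c2, c3 = (rng.randrange(100) for _ in range(3))
        eps1, eps2 = rng.randrange(1, 30), rng.randrange(1, 30)
        k = rng.randrange(1, 6)
        p1 = lambda r: eps_sel(r, c1, eps1)
        p2 = lambda r: eps_sel(r, c2, eps2)
        pk = lambda r: knn_sel(r, c3, k)
        # Two eps-Selections: either serialization matches the combined plan.
        if not (conceptual(rows, p1, p2) == sorted(p1(p2(rows))) == sorted(p2(p1(rows)))):
            return False
        # eps-Selection with kNN-Selection: the kNN-Selection runs first.
        if conceptual(rows, p1, pk) != sorted(p1(pk(rows))):
            return False
    return True


print(rules_hold())
```

The serialized plans pass the output of one selection as the input of the next, so they read the base relation once, while `conceptual` evaluates every predicate over the full input before intersecting.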
For ε-Selection and kNN-Selection predicates, this is:

$\sigma_{\theta_{\varepsilon,C_1}(e) \cap \theta_{kNN,C_2}(e)}(E) \equiv \sigma_{\theta_{\varepsilon,C_1}(e)}(\sigma_{\theta_{kNN,C_2}(e)}(E))$. (ER2)

$\sigma_{\theta_{kNN,C_1}(e) \cap \theta_{\varepsilon,C_2}(e)}(E) \not\equiv \sigma_{\theta_{kNN,C_1}(e)}(\sigma_{\theta_{\varepsilon,C_2}(e)}(E))$. (ER3)

Fig. 10 Combining/separating ε-Sel. and kNN-Sel. (ER2). [Figure omitted; kNN=4.]

Fig. 11 Combining/separating kNN-Sel. and ε-Sel. (ER3). [Figure omitted; kNN=4.]

Figure 10 represents the two plans of ER2. These plans are equivalent because kNN-Sel. is executed first in the RHS plan. Figure 11 shows a case where the two plans of ER3 produce different results. Finally, the dotted edge that starts and ends at the kNN-Selection node in Fig. 8 states that two kNN-Selection predicates cannot be combined or separated. This is:

$\sigma_{\theta_{kNN_1,C_1}(e) \cap \theta_{kNN_2,C_2}(e)}(E) \not\equiv \sigma_{\theta_{kNN_1,C_1}(e)}(\sigma_{\theta_{kNN_2,C_2}(e)}(E))$. (ER4)

Fig. 12 represents ER4 and shows a case where the plans that combine and separate two kNN-Sel. predicates generate different results.

TPC-H Example of ER2: List orders that are among the smallest 20 orders and that still generated a revenue of about $50,000 (±5,000). The SQL and evaluation plans based on ER2 are presented below.

    SELECT * FROM ORDERS O
    WHERE o_totalprice WITHIN 5000 OF 50000
      AND o_totalprice 20 TOP_CLOSEST_NEIGHBOR_OF 0;

$\sigma_{\theta_{\varepsilon=5000,C_1=50000}(o\_totalprice) \cap \theta_{kNN=20,C_2=0}(o\_totalprice)}(O) \equiv \sigma_{\theta_{\varepsilon=5000,C_1=50000}(o\_totalprice)}(\sigma_{\theta_{kNN=20,C_2=0}(o\_totalprice)}(O))$.

Proof of Rule ER1

Consider a generic tuple $t_E$ of E. We will show that for any possible value of $t_E$, the results generated by the plans of both sides of the rule are the same. The top part of Fig. 13.a shows a graphical representation of Rule ER1. Using the conceptual evaluation order of similarity queries, we can transform the left

9 imilarity Quris: Thir Concptual valuation, Transformations, and Procssing 9 knn1=4 knn2=4 knn1, () knn2, () knn1, () knn2, () knn1=4 knn2=4 ig. 12 Combining/sparating knn-l. and knn-l. (R4) part of th rul to an quivalnt xprssion that uss th intrsction opration as rprsntd in th bottom part of ig. 13.a. W will us this scond vrsion of th rul in th rmaining part of th proof. ig. 13.b givs th diffrnt possibl rgions for th valu of t. (1D). 1. Whn th valu of t. blongs to rgion A. In th LH plan, t is not slctd in any of th ε-lction oprators sinc it dos not satisfy any of th slction prdicats. Thus, no output is gnratd by this plan. In th RH plan, t is filtrd out by th bottom slction. No tupl flows to th top slction. Thus, no output is gnratd by this plan ithr. 2. Whn th valu of t. blongs to B. In th LH plan, t is slctd in th lft ε-lction but not in th right on. Th intrsction oprator dos not produc any output. No output is gnratd by this plan. In th RH plan, t is filtrd out by th bottom slction. No tupl flows to th top slction. No output is gnratd by this plan ithr. 3. Whn th valu of t. blongs to C. In th LH plan, t is slctd by both ε-lction oprators. Thus, t blongs to th output of th intrsction oprator. t blongs to th output of th LH plan. In th RH plan, t is slctd by th bottom ε- lction. t is also slctd by th top ε-lction. Thus, t blongs also to th output of th RH plan. 4. Whn th valu of t. blongs to D. In th LH plan, t is slctd in th right ε-lction but not in th lft on. Th intrsction oprator dos not produc any output. In th RH plan, t is slctd by th bottom ε-lction but filtrd out by th top on. No output is gnratd by this plan ithr. W can xtnd th proof to othr data typs idntifying th corrsponding rgions A-D. ig. 13.c shows th rgions for 2D data. or string data and dit distanc, B ar th strings within ε1 of but not within ε2 of, D ar th strings within ε2 of but not within ε1 of. 
C and A ar th strings that satisfy both or non of th conditions (within ε1 of, within ε2 of ), rspctivly. Gnric rmarks about proofs Most of th prsntd ruls, with th xcption of th ruls involving aggrgations, can b provd following a similar approach as th 1, () 1, () 2, () 1, () 2, () 2, () 2, () (a) quivalnt plans A B C A (b) Rgions of valus of possibl input tupls (1D) 1, () 1 C 2 D A B D (c) Rgions of valus of possibl input tupls (2D) ig. 13 Combining ε-l. and ε-l. (R1) - proof knn- Join -Join -lction knn- lction kd-join Join- Around (a) Whn th sl. attribut is th innr attr. in th join prdicat knn- Join -Join -lction knn- lction kd-join Join- Around (b) Whn th sl. attribut is th outr attr. in th join prdicat ig. 14 Possibl ways to combin and sparat and J on usd in th proof of R1, i.., idntifying all th distinct domain rgions of th rul attributs, and showing that th RH and LH xprssions of th rul gnrat th sam output in ach rgion. This papr prsnts proofs of multipl ruls. Th proofs of othr ruls can b asily constructd using th dscribd gnric approach. Additional proofs ar includd in [25] Combining/parating imilarity lction and imilarity Join and J prdicats can b combind or sparatd using th following gnric ruls. Whn th slction prdicat attribut is th innr attribut in th join prdicat: GR2. σ θ1 θ 2 () σ θ1 (σ θ2 ()), if thr is a dirctd dg from 2 to 1 in ig. 14.a. Whn th slction prdicat attribut is th outr attribut in th join prdicat: GR3. σ θ1 θ 2 () σ θ1 (σ θ2 ()), if thr is a dirctd dg from 2 to 1 in ig. 14.b. A prdicat of th form θ can b instantiatd as a imilarity l. (θ,c ()) or imilarity Join (θ (, f)) prdicat. igur 14 graphically rprsnts all th ways in which and J prdicats can b combind. Th following obsrvations can b drawn from it:

- We consider two generic cases: when the selection predicate attribute is the outer attribute in the join predicate, and when it is the inner one. This distinction is relevant, i.e., generates different equivalence rules, when the SJ operation is not commutative (kNN-Join and Join-Around). In general, if the join operation is commutative (ε-Join and kD-Join), the rules for both cases are the same. Commutativity of SJ operations is discussed in Section 4.2.
- Since Join-Around is a hybrid between the kNN-Join with k=1 and the ε-Join, the way this operation can be combined with a given operator corresponds to the most restricted way in which the kNN-Join or the ε-Join can be combined with that operator. This observation applies in fact to any rule that uses Join-Around.
- The rules where the selection attribute is the inner join attribute (Fig. 14.a) are equal to or more restrictive than the corresponding rules where the selection attribute is the outer join attribute (Fig. 14.b).

The instances of GR2 and GR3 are presented in Appendix A (R5-R31). We describe several of them next. Based on GR2 (when the selection attribute is the inner join attribute), the doubly directed edge between nodes ε-Join and ε-Sel. in Fig. 14.a states that these predicates can be combined/separated in any order, this is:

\[ \sigma_{\theta_{\varepsilon_1}(e_1,e_2)\,\cap\,\theta_{\varepsilon_2,C}(e_2)}(E) \equiv \sigma_{\theta_{\varepsilon_1}(e_1,e_2)}(\sigma_{\theta_{\varepsilon_2,C}(e_2)}(E)) \equiv \sigma_{\theta_{\varepsilon_2,C}(e_2)}(\sigma_{\theta_{\varepsilon_1}(e_1,e_2)}(E)) \qquad \text{(R5)} \]

Observe that the middle plan of R5 executes the ε-Sel. first, while the RHS plan executes the ε-Join first. R5 is graphically represented in Fig. 15.

Fig. 15 Combining/separating ε-Join and ε-Sel. (R5)

Also considering GR2, the directed edge from kNN-Join to ε-Selection in Fig. 14.a represents that these predicates can be combined or separated only if the kNN-Join is executed before the ε-Selection, this is:

\[ \sigma_{\theta_{\mathit{kNN}}(e_1,e_2)\,\cap\,\theta_{\varepsilon,C}(e_2)}(E) \not\equiv \sigma_{\theta_{\mathit{kNN}}(e_1,e_2)}(\sigma_{\theta_{\varepsilon,C}(e_2)}(E)) \qquad \text{(R8)} \]

\[ \sigma_{\theta_{\mathit{kNN}}(e_1,e_2)\,\cap\,\theta_{\varepsilon,C}(e_2)}(E) \equiv \sigma_{\theta_{\varepsilon,C}(e_2)}(\sigma_{\theta_{\mathit{kNN}}(e_1,e_2)}(E)) \qquad \text{(R9)} \]

Fig. 16 Combining/separating kNN-Join and ε-Sel. (R8, R9)

The RHS plan of R8 executes ε-Sel. first and produces a different result than the LHS plan. This is illustrated in the bottom plan of Fig. 16. The RHS plan of R9, on the other hand, executes kNN-Join first and is equivalent to the LHS plan. This is illustrated in the top plan of the same figure.

Let us consider now the same pair of nodes (kNN-Join and ε-Selection) under GR3 (when the selection attribute is the outer join attribute). The edge between these nodes is now a doubly directed edge (Fig. 14.b) and consequently the predicates can be combined or separated in any order:

\[ \sigma_{\theta_{\mathit{kNN}}(e_1,e_2)\,\cap\,\theta_{\varepsilon,C}(e_1)}(E) \equiv \sigma_{\theta_{\mathit{kNN}}(e_1,e_2)}(\sigma_{\theta_{\varepsilon,C}(e_1)}(E)) \equiv \sigma_{\theta_{\varepsilon,C}(e_1)}(\sigma_{\theta_{\mathit{kNN}}(e_1,e_2)}(E)) \qquad \text{(R23)} \]

Observe that the middle plan of R23 executes the ε-Sel. first while the RHS plan executes the kNN-Join first. Fig. 17 shows an example of R23. Finally, considering also GR3, the dotted edge between kD-Join and kNN-Selection in Fig. 14.b specifies that these predicates cannot be combined or separated in any order (R27, R28).

Fig. 17 Combining/separating kNN-Join and ε-Sel. (R23)

TPC-H example of R23: Find the closest three suppliers for every customer within 100 miles from our Chicago headquarters (X,Y). The SQL and evaluation plans based on R23 are presented below.

SELECT c_custkey, s_suppkey
FROM CUSTOMER C, SUPPLIER S
WHERE c_loc WITHIN 100 OF (X,Y)
AND s_loc 3 TOP_CLOSEST_NEIGHBORS_OF c_loc;

\[ \sigma_{\theta_{\mathit{kNN}=3}(\mathit{c\_loc},\mathit{s\_loc})\,\cap\,\theta_{\varepsilon=100,C=(X,Y)}(\mathit{c\_loc})}(C \times S) \]
\[ \equiv \sigma_{\theta_{\mathit{kNN}=3}(\mathit{c\_loc},\mathit{s\_loc})}(\sigma_{\theta_{\varepsilon=100,C=(X,Y)}(\mathit{c\_loc})}(C \times S)) \]
\[ \equiv \sigma_{\theta_{\varepsilon=100,C=(X,Y)}(\mathit{c\_loc})}(\sigma_{\theta_{\mathit{kNN}=3}(\mathit{c\_loc},\mathit{s\_loc})}(C \times S)) \]

These plans can be further transformed using additional rules. For instance, since \(\sigma_{\theta_S(e,f)}(E \times F) \equiv E \bowtie_{\theta_S(e,f)} F\) (see Section 3.2), the last plan is equivalent to:

\[ \sigma_{\theta_{\varepsilon=100,C=(X,Y)}(\mathit{c\_loc})}(C \bowtie_{\theta_{\mathit{kNN}=3}(\mathit{c\_loc},\mathit{s\_loc})} S) \]

Proof sketch of Rule R9

kNN-Join is defined over two relations. Assume that \(\theta_{\mathit{kNN}}\) is defined over relations E1 and E2, and that the input relation E is the cross product of all the relations involved in the similarity-aware predicates, i.e., E = E1 × E2. Furthermore, we assume that the join attributes are E1.e1 and E2.e2. Consider a generic tuple t1 of E1. We will show that for any possible pair (t1,t2), where t2 is a tuple of E2, the results generated by the plans of both sides of the rule are the same. The top part of Fig. 18.a gives a graphical representation of Rule R9. Using the conceptual evaluation order of similarity queries, we can transform the left part of the rule to an equivalent expression that uses the intersection operation as represented in the bottom part of Fig. 18.a. We will use this version of the rule in the remaining part of the proof. Fig. 18.b gives the different possible regions for the value of t2.e2. Note that the region marked as kNN (which comprises regions B and M) represents the region that contains the kNN closest neighbors of t1 in E2.

Fig. 18 Combining kNN-Join and ε-Sel. (R9) - proof. (a) Equivalent plans; (b) regions of values of possible input tuples.

1. When the value of t2.e2 belongs to A. In the LHS plan, (t1,t2) is not selected in any of the operators. No output is generated by this plan. In the RHS plan, (t1,t2) is filtered out by the bottom selection since t2 is not one of the kNN closest neighbors of t1 in E2. No tuple flows to the top operator and no output is generated by this plan.
2. When the value of t2.e2 belongs to B. In the LHS plan, the pair (t1,t2) is selected in the left operator but not in the right one. The intersection operator does not produce any output and consequently no output is generated by this plan. In the RHS plan, (t1,t2) is selected in the bottom selection since t2 is one of the kNN closest neighbors of t1 in E2. However, (t1,t2) is filtered out by the top selection because dist(t2.e2, C) > ε. Thus, no output is generated by this plan either.
3. When the value of t2.e2 belongs to M. In the LHS plan, (t1,t2) is selected in both operators. Consequently, (t1,t2) belongs to the output of the intersection and the LHS plan. In the RHS plan, (t1,t2) is selected by the bottom selection since t2 is one of the kNN closest neighbors of t1 in E2. (t1,t2) is also selected by the top selection since dist(t2.e2, C) ≤ ε. Thus, (t1,t2) belongs also to the output of the RHS plan.
4. When the value of t2.e2 belongs to D. In the LHS plan, the pair (t1,t2) is selected in the right similarity operator but not in the left one. The intersection operator does not produce any output and thus no output is generated by this plan. In the RHS plan, (t1,t2) is filtered out by the bottom selection. No tuple flows to the top operator. Thus, no output is generated by this plan either.

Combining/Separating Similarity Join Predicates

Fig. 19 Possible ways to combine/separate SJ predicates. Nodes: ε-Join, kD-Join, kNN-Join, and Join-Around. (a) When the attributes in the predicates have a single direction: e1 → e2, e2 → e3; (b) when the attributes in the predicates do not have a single direction: e1 → e2, e2 ← e3.

Multiple SJ predicates can be combined or separated using the following general rules.

When the attributes in the predicates have a single direction (e1 → e2, e2 → e3):

GR4.
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_2,e_3)}(E) \equiv \sigma_{\theta_{S_1}(e_1,e_2)}(\sigma_{\theta_{S_2}(e_2,e_3)}(E)) \]
and
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_2,e_3)}(E) \equiv \sigma_{\theta_{S_2}(e_2,e_3)}(\sigma_{\theta_{S_1}(e_1,e_2)}(E)) \]
if the graph of Fig. 19.a has a doubly directed edge of the form \(S_1(e_1,e_2) \leftrightarrow S_2(e_2,e_3)\).

GR5.
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_2,e_3)}(E) \equiv \sigma_{\theta_{S_1}(e_1,e_2)}(\sigma_{\theta_{S_2}(e_2,e_3)}(E)) \]
and
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_2,e_3)}(E) \not\equiv \sigma_{\theta_{S_2}(e_2,e_3)}(\sigma_{\theta_{S_1}(e_1,e_2)}(E)) \]
if the graph of Fig. 19.a has a directed edge of the form \(S_1(e_1,e_2) \leftarrow S_2(e_2,e_3)\).
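As a small illustration of the doubly directed case of GR4, the following sketch checks on made-up data that two ε-Join predicates over a single direction (e1 → e2, e2 → e3) return the same tuples whether they are evaluated together or serialized in either order. The helper names (`eps_join_pred`, `select`), the three toy relations, the threshold, and the use of absolute difference as the distance function are all assumptions made for illustration; they are not taken from the paper. Following the stated assumption about input relations, E is modeled as the cross product of the relations involved.

```python
from itertools import product


def eps_join_pred(i, j, eps):
    """Hypothetical ε-Join predicate: components i and j of a tuple are within eps."""
    return lambda t: abs(t[i] - t[j]) <= eps


def select(pred, rel):
    """Plain selection: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]


# Toy 1D relations (made-up values); E is their cross product.
E1 = [1, 4, 8]
E2 = [2, 5, 9]
E3 = [3, 6, 20]
E = list(product(E1, E2, E3))

p12 = eps_join_pred(0, 1, 1.5)  # predicate over (e1, e2)
p23 = eps_join_pred(1, 2, 1.5)  # predicate over (e2, e3)

combined = [t for t in E if p12(t) and p23(t)]  # both predicates at once
order_a = select(p12, select(p23, E))           # second predicate evaluated first
order_b = select(p23, select(p12, E))           # first predicate evaluated first

print(combined)                        # [(1, 2, 3), (4, 5, 6)]
print(combined == order_a == order_b)  # True
```

The three plans agree here because an ε-Join predicate depends only on the tuple being tested, not on which other tuples survived an earlier operator. A kNN-based predicate does not have this property, which is why the directed and dotted edges need separate treatment.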

When the attributes in the predicates do not have a single direction (e1 → e2, e2 ← e3):

GR6.
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_3,e_2)}(E) \equiv \sigma_{\theta_{S_1}(e_1,e_2)}(\sigma_{\theta_{S_2}(e_3,e_2)}(E)) \]
and
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_3,e_2)}(E) \equiv \sigma_{\theta_{S_2}(e_3,e_2)}(\sigma_{\theta_{S_1}(e_1,e_2)}(E)) \]
if the graph of Fig. 19.b has a doubly directed edge of the form \(S_1(e_1,e_2) \leftrightarrow S_2(e_3,e_2)\).

GR7.
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_3,e_2)}(E) \equiv \sigma_{\theta_{S_1}(e_1,e_2)}(\sigma_{\theta_{S_2}(e_3,e_2)}(E)) \]
and
\[ \sigma_{\theta_{S_1}(e_1,e_2)\,\cap\,\theta_{S_2}(e_3,e_2)}(E) \not\equiv \sigma_{\theta_{S_2}(e_3,e_2)}(\sigma_{\theta_{S_1}(e_1,e_2)}(E)) \]
if the graph of Fig. 19.b has a directed edge of the form \(S_1(e_1,e_2) \leftarrow S_2(e_3,e_2)\).

If the edge between two nodes is dotted in Fig. 19, none of the equivalences presented in rules GR4 or GR6 hold.

The graphs in Fig. 19 show the different ways in which two SJ predicates can be combined/separated. Two cases are considered: when the attributes in the predicates have a single direction, e.g., e1 → e2, e2 → e3; and when this is not the case, e.g., e1 → e2, e2 ← e3. In general, this classification generates different equivalence rules when at least one of the SJ operations is not commutative (kNN-Join and Join-Around). Appendix A presents all the instances of GR4-GR7 (R32-R65). We describe some of these here.

Under GR4 (the predicate attributes have a single direction: e1 → e2, e2 → e3), the doubly directed edge that starts and ends at the ε-Join node in Fig. 19.a specifies that two ε-Join predicates can be combined in any order. This is:

\[ \sigma_{\theta_{\varepsilon_1}(e_1,e_2)\,\cap\,\theta_{\varepsilon_2}(e_2,e_3)}(E) \equiv \sigma_{\theta_{\varepsilon_1}(e_1,e_2)}(\sigma_{\theta_{\varepsilon_2}(e_2,e_3)}(E)) \equiv \sigma_{\theta_{\varepsilon_2}(e_2,e_3)}(\sigma_{\theta_{\varepsilon_1}(e_1,e_2)}(E)) \qquad \text{(R32)} \]

Rule R32 is presented graphically in Fig. 20.

Fig. 20 Combining/separating two ε-Join predicates (R32)

Under GR7 (the predicate attributes do not have a single direction: e1 → e2, e2 ← e3), the directed edge from kNN-Join to ε-Join in Fig. 19.b states that these predicates can be combined executing kNN-Join first:

\[ \sigma_{\theta_{\varepsilon}(e_1,e_2)\,\cap\,\theta_{\mathit{kNN}}(e_3,e_2)}(E) \equiv \sigma_{\theta_{\varepsilon}(e_1,e_2)}(\sigma_{\theta_{\mathit{kNN}}(e_3,e_2)}(E)) \qquad \text{(R52)} \]

\[ \sigma_{\theta_{\varepsilon}(e_1,e_2)\,\cap\,\theta_{\mathit{kNN}}(e_3,e_2)}(E) \not\equiv \sigma_{\theta_{\mathit{kNN}}(e_3,e_2)}(\sigma_{\theta_{\varepsilon}(e_1,e_2)}(E)) \qquad \text{(R53)} \]

kNN-Join is executed first in the RHS plan of R52 while ε-Join is executed first in the RHS plan of R53.

Table 1 Cases where selection can be pushed below join. (a) When the sel. predicate attrib. is the outer attrib. in the join predicate; (b) when the sel. predicate attrib. is the inner attrib. in the join predicate. In both parts, the columns are Reg. Join, ε-Join, kNN-Join, kD-Join, and Join-Around, and the rows are Reg. Selection, ε-Selection, and kNN-Selection.

4.2 Other Core Equivalence Rules

Commutativity of Similarity Join Operators

Some SJ operations (ε-Join and kD-Join) are commutative as specified by the following general rule.

GR8.
\[ E \bowtie_{\theta_S(e,f)} F \equiv F \bowtie_{\theta_S(e,f)} E \]
when (i) S is ε-Join or kD-Join but not kNN-Join or Join-Around, and (ii) the distance function used in the operations is symmetric.

Distribution of (Similarity or Regular) Selection over (Similarity or Regular) Join

Pushing selection below join (distributing selection over join) is one of the most useful rules in regular relational algebra. In this section we extend this rule to the case of SS and SJ. Similarity or regular selection operations can be pushed below similarity or regular join operations according to the following general rules.

When the selection predicate attribute is the outer attribute in the join predicate:

GR9.
\[ \sigma_{\theta_{S_1}(e)}(E \bowtie_{\theta_{S_2}(e,f)} F) \equiv (\sigma_{\theta_{S_1}(e)}(E)) \bowtie_{\theta_{S_2}(e,f)} F \]
if cell [S1, S2] in Table 1.a is checked.

When the selection predicate attribute is the inner attribute in the join predicate:

GR10.
\[ \sigma_{\theta_{S_1}(f)}(E \bowtie_{\theta_{S_2}(e,f)} F) \equiv E \bowtie_{\theta_{S_2}(e,f)} (\sigma_{\theta_{S_1}(f)}(F)) \]
if cell [S1, S2] in Table 1.b is checked.

Table 1 summarizes all the cases where a selection operator (regular or similarity-aware) can be pushed below a join (regular or similarity-aware). This table and general rules GR9 and GR10 consider two generic cases: when the selection attribute is the outer attribute of the join predicate and when it is the inner one. The instances of GR9 and GR10 (R70-R101) are included in Appendix A. Some of them are presented next.

In some cases, a given selection type can be pushed below either input of a join. For instance, this is the case for regular selection and ε-Join. Note that both cells [Regular Selection, ε-Join] in Tables 1.a and 1.b have a check mark. Using GR9 and GR10 we obtain:

\[ \sigma_{\theta(e)}(E \bowtie_{\theta_{\varepsilon}(e,f)} F) \equiv (\sigma_{\theta(e)}(E)) \bowtie_{\theta_{\varepsilon}(e,f)} F \qquad \text{(R70)} \]

\[ \sigma_{\theta(f)}(E \bowtie_{\theta_{\varepsilon}(e,f)} F) \equiv E \bowtie_{\theta_{\varepsilon}(e,f)} (\sigma_{\theta(f)}(F)) \qquad \text{(R71)} \]

In R70, selection is pushed below the outer input of ε-Join; in R71, below the inner one. Fig. 21.a represents Rule R70 graphically.

Fig. 21 Distribution of selection over ε-Join (R70). (a) Equivalent plans; (b) regions of values of possible input tuples.

Similarly, for the case of ε-Selection and ε-Join we have:

\[ \sigma_{\theta_{\varepsilon_1,C}(e)}(E \bowtie_{\theta_{\varepsilon_2}(e,f)} F) \equiv (\sigma_{\theta_{\varepsilon_1,C}(e)}(E)) \bowtie_{\theta_{\varepsilon_2}(e,f)} F \qquad \text{(R86)} \]

\[ \sigma_{\theta_{\varepsilon_1,C}(f)}(E \bowtie_{\theta_{\varepsilon_2}(e,f)} F) \equiv E \bowtie_{\theta_{\varepsilon_2}(e,f)} (\sigma_{\theta_{\varepsilon_1,C}(f)}(F)) \qquad \text{(R87)} \]

In other cases, selection can only be pushed below the outer input of a join. This is the case for ε-Selection and kNN-Join. Note that the cell [ε-Selection, kNN-Join] has a check mark only in Table 1.a. Using GR9 and GR10 we get the following rules:

\[ \sigma_{\theta_{\varepsilon,C}(e)}(E \bowtie_{\theta_{\mathit{kNN}}(e,f)} F) \equiv (\sigma_{\theta_{\varepsilon,C}(e)}(E)) \bowtie_{\theta_{\mathit{kNN}}(e,f)} F \qquad \text{(R88)} \]

\[ \sigma_{\theta_{\varepsilon,C}(f)}(E \bowtie_{\theta_{\mathit{kNN}}(e,f)} F) \not\equiv E \bowtie_{\theta_{\mathit{kNN}}(e,f)} (\sigma_{\theta_{\varepsilon,C}(f)}(F)) \qquad \text{(R89)} \]

Fig. 22 shows that pushing ε-Sel. below the outer input of kNN-Join generates the same result as executing kNN-Join first. On the other hand, pushing ε-Sel. below the inner input of kNN-Join can generate a different result as seen in Fig. 23.

Fig. 22 Distribution of ε-Sel. over kNN-Join - when sel. is pushed below the outer relation (R88)

Fig. 23 Distribution of ε-Sel. over kNN-Join - when sel. is pushed below the inner relation (R89)

TPC-H example of R86: Considering the customers that are located within 200 miles from our Chicago headquarters (X,Y), identify the customers that are located within 10 miles of certain locations of interest (INTER_LOCATION). The SQL and evaluation plans based on R86 are presented below.

SELECT c_custkey, il_locname
FROM CUSTOMER C, INTER_LOCATION IL
WHERE c_loc WITHIN 10 OF il_loc
AND c_loc WITHIN 200 OF (X,Y);

\[ \sigma_{\theta_{\varepsilon_1=200,C=(X,Y)}(\mathit{c\_loc})}(C \bowtie_{\theta_{\varepsilon_2=10}(\mathit{c\_loc},\mathit{il\_loc})} IL) \equiv (\sigma_{\theta_{\varepsilon_1=200,C=(X,Y)}(\mathit{c\_loc})}(C)) \bowtie_{\theta_{\varepsilon_2=10}(\mathit{c\_loc},\mathit{il\_loc})} IL \]
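The contrast between R88 and R89 can be checked on a toy instance. The sketch below is only an illustration under stated assumptions: the helpers `eps_select` and `knn_join` are hypothetical, the data are made-up 1D points, the distance is the absolute difference, and ties between equally distant neighbors are broken by the smaller value. None of these choices come from the paper.

```python
def eps_select(rel, center, eps):
    """Hypothetical ε-Selection: keep the values within eps of center."""
    return [v for v in rel if abs(v - center) <= eps]


def knn_join(outer, inner, k):
    """Hypothetical kNN-Join: pair each outer value with its k nearest inner values."""
    pairs = set()
    for e in outer:
        # Ties are broken by the smaller inner value (an assumption of this sketch).
        nearest = sorted(inner, key=lambda f: (abs(e - f), f))[:k]
        pairs.update((e, f) for f in nearest)
    return pairs


E = [1, 5, 9]          # outer relation (made-up values)
F = [0, 2, 4, 6, 10]   # inner relation (made-up values)
K, CENTER, EPS = 2, 5, 2

joined = knn_join(E, F, K)

# R88: the ε-Selection is on the outer attribute e.
r88_lhs = {(e, f) for (e, f) in joined if abs(e - CENTER) <= EPS}
r88_rhs = knn_join(eps_select(E, CENTER, EPS), F, K)

# R89: the ε-Selection is on the inner attribute f.
r89_lhs = {(e, f) for (e, f) in joined if abs(f - CENTER) <= EPS}
r89_rhs = knn_join(E, eps_select(F, CENTER, EPS), K)

print(r88_lhs == r88_rhs)  # True
print(r89_lhs == r89_rhs)  # False
print(sorted(r89_lhs))     # [(5, 4), (5, 6), (9, 6)]
print(sorted(r89_rhs))     # [(1, 4), (1, 6), (5, 4), (5, 6), (9, 4), (9, 6)]
```

Filtering the outer input only removes outer tuples, so the neighbors of the surviving tuples are unchanged. Filtering the inner input changes the candidate set from which the k nearest neighbors are drawn, so outer values such as 1 and 9 acquire neighbors they did not have in the unfiltered join.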
Proof sketch of Rule R70

The join attributes in \(\theta_{\varepsilon}\) are E.e and F.f, and θ is defined over E.e. Consider a generic tuple tE of E. We will show that for any possible pair (tE,tF), where tF is a tuple of F, the results generated by the plans of both sides of the rule are the same. Fig. 21.b gives the different possible regions for the values of tF.f and two generic values of tE.e. a2 represents a value that satisfies the predicate θ while a1 represents a value that does not.

1. When the value of tE.e is a1. In the LHS plan, the pair (tE,tF) may or may not belong to the output of the ε-Join. However, (tE,tF) will be filtered out by the selection operator since a1 does not satisfy the predicate θ. Thus, no output is generated by this plan. In the RHS plan, tE is filtered out by the selection since a1 does not satisfy θ. No tuple flows to the ε-Join operator from its outer input. Thus, no output is generated by this plan either.
2. When the value of tE.e is a2 and the value of tF.f belongs to A. In the LHS plan, the pair (tE,tF) does not belong to the output of the ε-Join since dist(tE.e, tF.f) > ε. No tuple flows to the selection operator. Thus, no output is generated by this plan. In the RHS plan, tE is selected by the regular selection operator since a2 satisfies θ. However, the pair (tE,tF) does not belong to the output of the ε-Join since dist(tE.e, tF.f) > ε. Thus, no output is generated by this plan either.
3. When the value of tE.e is a2 and the value of tF.f belongs to B. In the LHS plan, the pair (tE,tF) belongs to the output of the ε-Join since dist(tE.e, tF.f) ≤ ε. (tE,tF) is also selected by the regular selection operator since a2 satisfies θ. Thus, (tE,tF) belongs to the output of this plan. In the RHS plan, tE is selected by the selection operator since a2 satisfies θ. (tE,tF) belongs to the output of the ε-Join since dist(tE.e, tF.f) ≤ ε. Thus, (tE,tF) also belongs to the output of this plan.

Associativity of Similarity Join Operators

Associativity of join operators is another core transformation rule commonly used in query optimization.


2 AN OVERVIEW OF THE TENSOR PRODUCT 98 IEEE TRASACTIS PARALLEL AD DISTRIBUTED SYSTEMS, VL 10, 3, MARCH 1999 1 Th choic of data distribution has a larg influnc on th prformanc of th synthsizd programs, ur simpl algorithm for slcting th appropriat

More information

What are those βs anyway? Understanding Design Matrix & Odds ratios

What are those βs anyway? Understanding Design Matrix & Odds ratios Ral paramtr stimat WILD 750 - Wildlif Population Analysis of 6 What ar thos βs anyway? Undrsting Dsign Matrix & Odds ratios Rfrncs Hosmr D.W.. Lmshow. 000. Applid logistic rgrssion. John Wily & ons Inc.

More information

The graph of y = x (or y = ) consists of two branches, As x 0, y + ; as x 0, y +. x = 0 is the

The graph of y = x (or y = ) consists of two branches, As x 0, y + ; as x 0, y +. x = 0 is the Copyright itutcom 005 Fr download & print from wwwitutcom Do not rproduc by othr mans Functions and graphs Powr functions Th graph of n y, for n Q (st of rational numbrs) y is a straight lin through th

More information

A Sub-Optimal Log-Domain Decoding Algorithm for Non-Binary LDPC Codes

A Sub-Optimal Log-Domain Decoding Algorithm for Non-Binary LDPC Codes Procdings of th 9th WSEAS Intrnational Confrnc on APPLICATIONS of COMPUTER ENGINEERING A Sub-Optimal Log-Domain Dcoding Algorithm for Non-Binary LDPC Cods CHIRAG DADLANI and RANJAN BOSE Dpartmnt of Elctrical

More information

Aim To manage files and directories using Linux commands. 1. file Examines the type of the given file or directory

Aim To manage files and directories using Linux commands. 1. file Examines the type of the given file or directory m E x. N o. 3 F I L E M A N A G E M E N T Aim To manag ils and dirctoris using Linux commands. I. F i l M a n a g m n t 1. il Examins th typ o th givn il or dirctory i l i l n a m > ( o r ) < d i r c t

More information

Hydrogen Atom and One Electron Ions

Hydrogen Atom and One Electron Ions Hydrogn Atom and On Elctron Ions Th Schrödingr quation for this two-body problm starts out th sam as th gnral two-body Schrödingr quation. First w sparat out th motion of th cntr of mass. Th intrnal potntial

More information

From Elimination to Belief Propagation

From Elimination to Belief Propagation School of omputr Scinc Th lif Propagation (Sum-Product lgorithm Probabilistic Graphical Modls (10-708 Lctur 5, Sp 31, 2007 Rcptor Kinas Rcptor Kinas Kinas X 5 ric Xing Gn G T X 6 X 7 Gn H X 8 Rading: J-hap

More information

Basic Polyhedral theory

Basic Polyhedral theory Basic Polyhdral thory Th st P = { A b} is calld a polyhdron. Lmma 1. Eithr th systm A = b, b 0, 0 has a solution or thr is a vctorπ such that π A 0, πb < 0 Thr cass, if solution in top row dos not ist

More information

The Equitable Dominating Graph

The Equitable Dominating Graph Intrnational Journal of Enginring Rsarch and Tchnology. ISSN 0974-3154 Volum 8, Numbr 1 (015), pp. 35-4 Intrnational Rsarch Publication Hous http://www.irphous.com Th Equitabl Dominating Graph P.N. Vinay

More information

Abstract Interpretation. Lecture 5. Profs. Aiken, Barrett & Dill CS 357 Lecture 5 1

Abstract Interpretation. Lecture 5. Profs. Aiken, Barrett & Dill CS 357 Lecture 5 1 Abstract Intrprtation 1 History On brakthrough papr Cousot & Cousot 77 (?) Inspird by Dataflow analysis Dnotational smantics Enthusiastically mbracd by th community At last th functional community... At

More information

Mutually Independent Hamiltonian Cycles of Pancake Networks

Mutually Independent Hamiltonian Cycles of Pancake Networks Mutually Indpndnt Hamiltonian Cycls of Pancak Ntworks Chng-Kuan Lin Dpartmnt of Mathmatics National Cntral Univrsity, Chung-Li, Taiwan 00, R O C discipl@ms0urlcomtw Hua-Min Huang Dpartmnt of Mathmatics

More information

Self-Adjointness and Its Relationship to Quantum Mechanics. Ronald I. Frank 2016

Self-Adjointness and Its Relationship to Quantum Mechanics. Ronald I. Frank 2016 Ronald I. Frank 06 Adjoint https://n.wikipdia.org/wiki/adjoint In gnral thr is an oprator and a procss that dfin its adjoint *. It is thn slf-adjoint if *. Innr product spac https://n.wikipdia.org/wiki/innr_product_spac

More information

Homework #3. 1 x. dx. It therefore follows that a sum of the

Homework #3. 1 x. dx. It therefore follows that a sum of the Danil Cannon CS 62 / Luan March 5, 2009 Homwork # 1. Th natural logarithm is dfind by ln n = n 1 dx. It thrfor follows that a sum of th 1 x sam addnd ovr th sam intrval should b both asymptotically uppr-

More information

Final Exam Solutions

Final Exam Solutions CS 2 Advancd Data Structurs and Algorithms Final Exam Solutions Jonathan Turnr /8/20. (0 points) Suppos that r is a root of som tr in a Fionacci hap. Assum that just for a dltmin opration, r has no childrn

More information

Derangements and Applications

Derangements and Applications 2 3 47 6 23 Journal of Intgr Squncs, Vol. 6 (2003), Articl 03..2 Drangmnts and Applications Mhdi Hassani Dpartmnt of Mathmatics Institut for Advancd Studis in Basic Scincs Zanjan, Iran mhassani@iasbs.ac.ir

More information

Introduction to Arithmetic Geometry Fall 2013 Lecture #20 11/14/2013

Introduction to Arithmetic Geometry Fall 2013 Lecture #20 11/14/2013 18.782 Introduction to Arithmtic Gomtry Fall 2013 Lctur #20 11/14/2013 20.1 Dgr thorm for morphisms of curvs Lt us rstat th thorm givn at th nd of th last lctur, which w will now prov. Thorm 20.1. Lt φ:

More information

International Journal of Scientific & Engineering Research, Volume 6, Issue 7, July ISSN

International Journal of Scientific & Engineering Research, Volume 6, Issue 7, July ISSN Intrnational Journal of Scintific & Enginring Rsarch, Volum 6, Issu 7, July-25 64 ISSN 2229-558 HARATERISTIS OF EDGE UTSET MATRIX OF PETERSON GRAPH WITH ALGEBRAI GRAPH THEORY Dr. G. Nirmala M. Murugan

More information

4.2 Design of Sections for Flexure

4.2 Design of Sections for Flexure 4. Dsign of Sctions for Flxur This sction covrs th following topics Prliminary Dsign Final Dsign for Typ 1 Mmbrs Spcial Cas Calculation of Momnt Dmand For simply supportd prstrssd bams, th maximum momnt

More information

Properties of Quarks ( ) Isospin. π = 1, 1

Properties of Quarks ( ) Isospin. π = 1, 1 Proprtis of Quarks Isospin So far, w hav discussd thr familis of lptons but principally concntratd on on doublt of quarks, th u and d. W will now introduc othr typs of quarks, along with th nw quantum

More information

2F1120 Spektrala transformer för Media Solutions to Steiglitz, Chapter 1

2F1120 Spektrala transformer för Media Solutions to Steiglitz, Chapter 1 F110 Spktrala transformr för Mdia Solutions to Stiglitz, Chaptr 1 Prfac This documnt contains solutions to slctd problms from Kn Stiglitz s book: A Digital Signal Procssing Primr publishd by Addison-Wsly.

More information

Propositional Logic. Combinatorial Problem Solving (CPS) Albert Oliveras Enric Rodríguez-Carbonell. May 17, 2018

Propositional Logic. Combinatorial Problem Solving (CPS) Albert Oliveras Enric Rodríguez-Carbonell. May 17, 2018 Propositional Logic Combinatorial Problm Solving (CPS) Albrt Olivras Enric Rodríguz-Carbonll May 17, 2018 Ovrviw of th sssion Dfinition of Propositional Logic Gnral Concpts in Logic Rduction to SAT CNFs

More information

ME 321 Kinematics and Dynamics of Machines S. Lambert Winter 2002

ME 321 Kinematics and Dynamics of Machines S. Lambert Winter 2002 3.4 Forc Analysis of Linkas An undrstandin of forc analysis of linkas is rquird to: Dtrmin th raction forcs on pins, tc. as a consqunc of a spcifid motion (don t undrstimat th sinificanc of dynamic or

More information

Category Theory Approach to Fusion of Wavelet-Based Features

Category Theory Approach to Fusion of Wavelet-Based Features Catgory Thory Approach to Fusion of Wavlt-Basd Faturs Scott A. DLoach Air Forc Institut of Tchnology Dpartmnt of Elctrical and Computr Enginring Wright-Pattrson AFB, Ohio 45433 Scott.DLoach@afit.af.mil

More information

Dealing with quantitative data and problem solving life is a story problem! Attacking Quantitative Problems

Dealing with quantitative data and problem solving life is a story problem! Attacking Quantitative Problems Daling with quantitati data and problm soling lif is a story problm! A larg portion of scinc inols quantitati data that has both alu and units. Units can sa your butt! Nd handl on mtric prfixs Dimnsional

More information

Where k is either given or determined from the data and c is an arbitrary constant.

Where k is either given or determined from the data and c is an arbitrary constant. Exponntial growth and dcay applications W wish to solv an quation that has a drivativ. dy ky k > dx This quation says that th rat of chang of th function is proportional to th function. Th solution is

More information

Mor Tutorial at www.dumblittldoctor.com Work th problms without a calculator, but us a calculator to chck rsults. And try diffrntiating your answrs in part III as a usful chck. I. Applications of Intgration

More information

Properties of Phase Space Wavefunctions and Eigenvalue Equation of Momentum Dispersion Operator

Properties of Phase Space Wavefunctions and Eigenvalue Equation of Momentum Dispersion Operator Proprtis of Phas Spac Wavfunctions and Eignvalu Equation of Momntum Disprsion Oprator Ravo Tokiniaina Ranaivoson 1, Raolina Andriambololona 2, Hanitriarivo Rakotoson 3 raolinasp@yahoo.fr 1 ;jacqulinraolina@hotmail.com

More information

Supplementary Materials

Supplementary Materials 6 Supplmntary Matrials APPENDIX A PHYSICAL INTERPRETATION OF FUEL-RATE-SPEED FUNCTION A truck running on a road with grad/slop θ positiv if moving up and ngativ if moving down facs thr rsistancs: arodynamic

More information

Thus, because if either [G : H] or [H : K] is infinite, then [G : K] is infinite, then [G : K] = [G : H][H : K] for all infinite cases.

Thus, because if either [G : H] or [H : K] is infinite, then [G : K] is infinite, then [G : K] = [G : H][H : K] for all infinite cases. Homwork 5 M 373K Solutions Mark Lindbrg and Travis Schdlr 1. Prov that th ring Z/mZ (for m 0) is a fild if and only if m is prim. ( ) Proof by Contrapositiv: Hr, thr ar thr cass for m not prim. m 0: Whn

More information

Einstein Equations for Tetrad Fields

Einstein Equations for Tetrad Fields Apiron, Vol 13, No, Octobr 006 6 Einstin Equations for Ttrad Filds Ali Rıza ŞAHİN, R T L Istanbul (Turky) Evry mtric tnsor can b xprssd by th innr product of ttrad filds W prov that Einstin quations for

More information

As the matrix of operator B is Hermitian so its eigenvalues must be real. It only remains to diagonalize the minor M 11 of matrix B.

As the matrix of operator B is Hermitian so its eigenvalues must be real. It only remains to diagonalize the minor M 11 of matrix B. 7636S ADVANCED QUANTUM MECHANICS Solutions Spring. Considr a thr dimnsional kt spac. If a crtain st of orthonormal kts, say, and 3 ar usd as th bas kts, thn th oprators A and B ar rprsntd by a b A a and

More information

surface of a dielectric-metal interface. It is commonly used today for discovering the ways in

surface of a dielectric-metal interface. It is commonly used today for discovering the ways in Surfac plasmon rsonanc is snsitiv mchanism for obsrving slight changs nar th surfac of a dilctric-mtal intrfac. It is commonl usd toda for discovring th was in which protins intract with thir nvironmnt,

More information

GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES. Eduard N. Klenov* Rostov-on-Don, Russia

GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES. Eduard N. Klenov* Rostov-on-Don, Russia GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES Eduard N. Klnov* Rostov-on-Don, Russia Th articl considrs phnomnal gomtry figurs bing th carrirs of valu spctra for th pairs of th rmaining additiv

More information

Problem Set #2 Due: Friday April 20, 2018 at 5 PM.

Problem Set #2 Due: Friday April 20, 2018 at 5 PM. 1 EE102B Spring 2018 Signal Procssing and Linar Systms II Goldsmith Problm St #2 Du: Friday April 20, 2018 at 5 PM. 1. Non-idal sampling and rcovry of idal sampls by discrt-tim filtring 30 pts) Considr

More information

u 3 = u 3 (x 1, x 2, x 3 )

u 3 = u 3 (x 1, x 2, x 3 ) Lctur 23: Curvilinar Coordinats (RHB 8.0 It is oftn convnint to work with variabls othr than th Cartsian coordinats x i ( = x, y, z. For xampl in Lctur 5 w mt sphrical polar and cylindrical polar coordinats.

More information

Linear Non-Gaussian Structural Equation Models

Linear Non-Gaussian Structural Equation Models IMPS 8, Durham, NH Linar Non-Gaussian Structural Equation Modls Shohi Shimizu, Patrik Hoyr and Aapo Hyvarinn Osaka Univrsity, Japan Univrsity of Hlsinki, Finland Abstract Linar Structural Equation Modling

More information

Title: Vibrational structure of electronic transition

Title: Vibrational structure of electronic transition Titl: Vibrational structur of lctronic transition Pag- Th band spctrum sn in th Ultra-Violt (UV) and visibl (VIS) rgions of th lctromagntic spctrum can not intrprtd as vibrational and rotational spctrum

More information

Learning Spherical Convolution for Fast Features from 360 Imagery

Learning Spherical Convolution for Fast Features from 360 Imagery Larning Sphrical Convolution for Fast Faturs from 36 Imagry Anonymous Author(s) 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 3 3 33 34 35 In this fil w provid additional dtails to supplmnt th main papr

More information

BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES

BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES DONALD M. DAVIS Abstract. If p is a prim (implicit in notation and n a positiv intgr, lt ν(n dnot th xponnt of p in n, and U(n n/p ν(n, th unit

More information

EXPONENTIAL ENTROPY ON INTUITIONISTIC FUZZY SETS

EXPONENTIAL ENTROPY ON INTUITIONISTIC FUZZY SETS K Y B E R N E T I K A V O L U M E 4 9 0 3, N U M B E R, P A G E S 4 7 EXPONENTIAL ENTROPY ON INTUITIONISTIC FUZZY SETS Rajkumar Vrma and Bhu Dv Sharma In th prsnt papr, basd on th concpt of fuzzy ntropy,

More information

Approximation and Inapproximation for The Influence Maximization Problem in Social Networks under Deterministic Linear Threshold Model

Approximation and Inapproximation for The Influence Maximization Problem in Social Networks under Deterministic Linear Threshold Model 20 3st Intrnational Confrnc on Distributd Computing Systms Workshops Approximation and Inapproximation for Th Influnc Maximization Problm in Social Ntworks undr Dtrministic Linar Thrshold Modl Zaixin Lu,

More information

u x v x dx u x v x v x u x dx d u x v x u x v x dx u x v x dx Integration by Parts Formula

u x v x dx u x v x v x u x dx d u x v x u x v x dx u x v x dx Integration by Parts Formula 7. Intgration by Parts Each drivativ formula givs ris to a corrsponding intgral formula, as w v sn many tims. Th drivativ product rul yilds a vry usful intgration tchniqu calld intgration by parts. Starting

More information

UNTYPED LAMBDA CALCULUS (II)

UNTYPED LAMBDA CALCULUS (II) 1 UNTYPED LAMBDA CALCULUS (II) RECALL: CALL-BY-VALUE O.S. Basic rul Sarch ruls: (\x.) v [v/x] 1 1 1 1 v v CALL-BY-VALUE EVALUATION EXAMPLE (\x. x x) (\y. y) x x [\y. y / x] = (\y. y) (\y. y) y [\y. y /

More information

SCHUR S THEOREM REU SUMMER 2005

SCHUR S THEOREM REU SUMMER 2005 SCHUR S THEOREM REU SUMMER 2005 1. Combinatorial aroach Prhas th first rsult in th subjct blongs to I. Schur and dats back to 1916. On of his motivation was to study th local vrsion of th famous quation

More information

Searching Linked Lists. Perfect Skip List. Building a Skip List. Skip List Analysis (1) Assume the list is sorted, but is stored in a linked list.

Searching Linked Lists. Perfect Skip List. Building a Skip List. Skip List Analysis (1) Assume the list is sorted, but is stored in a linked list. 3 3 4 8 6 3 3 4 8 6 3 3 4 8 6 () (d) 3 Sarching Linkd Lists Sarching Linkd Lists Sarching Linkd Lists ssum th list is sortd, but is stord in a linkd list. an w us binary sarch? omparisons? Work? What if

More information

Slide 1. Slide 2. Slide 3 DIGITAL SIGNAL PROCESSING CLASSIFICATION OF SIGNALS

Slide 1. Slide 2. Slide 3 DIGITAL SIGNAL PROCESSING CLASSIFICATION OF SIGNALS Slid DIGITAL SIGAL PROCESSIG UIT I DISCRETE TIME SIGALS AD SYSTEM Slid Rviw of discrt-tim signals & systms Signal:- A signal is dfind as any physical quantity that varis with tim, spac or any othr indpndnt

More information

Lecture 37 (Schrödinger Equation) Physics Spring 2018 Douglas Fields

Lecture 37 (Schrödinger Equation) Physics Spring 2018 Douglas Fields Lctur 37 (Schrödingr Equation) Physics 6-01 Spring 018 Douglas Filds Rducd Mass OK, so th Bohr modl of th atom givs nrgy lvls: E n 1 k m n 4 But, this has on problm it was dvlopd assuming th acclration

More information

4037 ADDITIONAL MATHEMATICS

4037 ADDITIONAL MATHEMATICS CAMBRIDGE INTERNATIONAL EXAMINATIONS GCE Ordinary Lvl MARK SCHEME for th Octobr/Novmbr 0 sris 40 ADDITIONAL MATHEMATICS 40/ Papr, maimum raw mark 80 This mark schm is publishd as an aid to tachrs and candidats,

More information

ph People Grade Level: basic Duration: minutes Setting: classroom or field site

ph People Grade Level: basic Duration: minutes Setting: classroom or field site ph Popl Adaptd from: Whr Ar th Frogs? in Projct WET: Curriculum & Activity Guid. Bozman: Th Watrcours and th Council for Environmntal Education, 1995. ph Grad Lvl: basic Duration: 10 15 minuts Stting:

More information

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis Middl East Tchnical Univrsity Dpartmnt of Mchanical Enginring ME 43 Introduction to Finit Elmnt Analysis Chaptr 3 Computr Implmntation of D FEM Ths nots ar prpard by Dr. Cünyt Srt http://www.m.mtu.du.tr/popl/cunyt

More information

First derivative analysis

First derivative analysis Robrto s Nots on Dirntial Calculus Chaptr 8: Graphical analysis Sction First drivativ analysis What you nd to know alrady: How to us drivativs to idntiy th critical valus o a unction and its trm points

More information

EXST Regression Techniques Page 1

EXST Regression Techniques Page 1 EXST704 - Rgrssion Tchniqus Pag 1 Masurmnt rrors in X W hav assumd that all variation is in Y. Masurmnt rror in this variabl will not ffct th rsults, as long as thy ar uncorrlatd and unbiasd, sinc thy

More information

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon.

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon. PART I TRUE/FALSE/UNCERTAIN (5 points ach) 1. Lik xpansionary montary policy, xpansionary fiscal policy rturns output in th mdium run to its natural lvl, and incrass prics. Thrfor, fiscal policy is also

More information

A Prey-Predator Model with an Alternative Food for the Predator, Harvesting of Both the Species and with A Gestation Period for Interaction

A Prey-Predator Model with an Alternative Food for the Predator, Harvesting of Both the Species and with A Gestation Period for Interaction Int. J. Opn Problms Compt. Math., Vol., o., Jun 008 A Pry-Prdator Modl with an Altrnativ Food for th Prdator, Harvsting of Both th Spcis and with A Gstation Priod for Intraction K. L. arayan and. CH. P.

More information

Differential Equations

Differential Equations Prfac Hr ar m onlin nots for m diffrntial quations cours that I tach hr at Lamar Univrsit. Dspit th fact that ths ar m class nots, th should b accssibl to anon wanting to larn how to solv diffrntial quations

More information