Full Disjunctions: Polynomial-Delay Iterators in Action

Size: px
Start display at page:

Download "Full Disjunctions: Polynomial-Delay Iterators in Action"

Transcription

1 Full Disjunctins: Plynmial-Delay Iteratrs in Actin Sara Chen Technin Haifa, Israel Itzhak Fadida Technin Haifa, Israel Yarn Kanza University f Trnt Trnt, Canada yarn@cs.trnt.edu ABSTRACT Benny Kimelfeld The Hebrew University Jerusalem, Israel bennyk@cs.huji.ac.il Full disjunctins are an assciative extensin f the uterjin peratr t an arbitrary number f relatins. Their main advantage is the ability t maximally cmbine data frm different relatins while preserving all the riginal infrmatin. An algrithm fr efficiently cmputing full disjunctins is presented. This algrithm is superir t previus nes in three ways. First, it is the first algrithm that cmputes a full disjunctin with a plynmial delay between tuples. Hence, it can be implemented as an iteratr that prduces a stream f tuples, which is imprtant in many cases (e.g., pipelined query prcessing and Web applicatins). Secnd, the ttal runtime is linear in the size f the utput. Third, the algrithm emplys a nvel ptimizatin that divides the relatin schemes int bicnnected cmpnents, uses a separate iteratr fr each cmpnent and applies uterjins whenever pssible. Cmbining efficiently full disjunctins with standard SQL peratrs is discussed. Experiments shw the superirity f ur algrithm ver the state f the art. 1. INTRODUCTION In many scenaris, databases that were built independently cntain related infrmatin. The Internet has made many such databases accessible and enhanced the need fr reliable and efficient data-integratin techniques. The gal f data integratin is t enable users t utilize all the available infrmatin frm independent data surces. T that end, integratin techniques shuld cnnect related pieces f infrmatin in a maximal way, that is, all and nly related pieces shuld be cmbined withut causing any lss f infrmatin. Integratin f tw relatins can be dne using the natural uterjin peratr (r uterjin fr shrt) [4, 5, 9]. Hwever, the uterjin is nt suitable fr integrating mre than tw Permissin t cpy withut fee all r part f this material is granted prvided that the cpies are nt made r distributed fr direct cmmercial advantage, the VLDB cpyright ntice and the title f the publicatin and its date appear, and ntice is given that cpying is by permissin f the Very Large Data Base Endwment. T cpy therwise, r t republish, t pst n servers r t redistribute t lists, requires a fee and/r special permissin frm the publisher, ACM. VLDB 6, September 12-15, 26, Seul, Krea. Cpyright 26 VLDB Endwment, ACM /6/9. Yehshua Sagiv The Hebrew University Jerusalem, Israel sagiv@cs.huji.ac.il relatins; fr example, the uterjin is nt assciative. The full disjunctin [7] is an assciative and cmmutative generalizatin f the uterjin peratr. Hence, it is suitable fr integrating any number f relatins. Intuitively, fr a given set f relatins, the full disjunctin is btained by jining maximal sets f related tuples (i.e., tuples that agree n all the cmmn attributes). Since every given tuple is included in at least ne tuple f the result, all the available infrmatin is preserved. In cmparisn, the tuples prduced by a sequence f uterjins include all the riginal infrmatin, but are nt necessarily maximal, i.e., uterjins cannt always cmbine all related pieces f infrmatin. Thus, in general, results f uterjins may have redundant tuples r suffer frm infrmatin lss (since sme related tuples f the surce relatins are nt jined in the result). Our gal is t prvide practically efficient algrithms fr cmputing full disjunctins. Such algrithms are essential fr incrprating full disjunctins int standard query languages. The next example demnstrates the imprtance f a full-disjunctin peratr as a primitive language cnstruct. Example 1.1. Suppse that the fllwing relatins are stred in a database cntaining turistic infrmatin: Climates(Cuntry, Climate), Accmmdatins(Cuntry, City, Htel, Stars) and Sites(Cuntry, City, Site). We wuld like t find a place t visit, based n the infrmatin f ur database. The natural jin f the three relatins abve may nt yield the desired result, since a jin may cause interesting data t be lst due t missing infrmatin in the tables. Fr sme cities, fr example, the database may cntain n sites at all. Fr such cities, there is n crrespnding tuple in the Sites relatin. As anther example, sme natinal sites may be ut f any specific city and, hence, the tuples f these sites will nt jin with any tuple f Accmmdatins. T prevent lss f infrmatin, we wuld like t be able t pse the fllwing query. SELECT Cuntry,City,Stars,Site FROM FD(Climates,Accmmdatins,Sites) AS F WHERE F.Climate = trpical ORDER BY Stars In this query, FD(Climates,Accmmdatins,Sites) is an peratin that cmputes the full disjunctin f its arguments. Intuitively, this query returns maximal infrmatin abut turist sites and accmmdatins in cuntries with trpical climate, ranked by the rating f the htels. 739

2 When evaluating queries, such as that f Example 1.1, an efficient algrithm fr cmputing full disjunctins is clearly needed. The best knwn algrithm fr cmputing full disjunctins [2] runs in time that is quadratic in the size f the utput. Since the utput f a full disjunctin is typically large (it includes, at least, all the tuples f the surce relatins), it is clearly desirable t reduce the dependency t ptimum, i.e., achieve linear time in the size f the utput (and, f curse, plynmial time in the size f the input). One way f achieving linear dependency in the utput is t devise an algrithm that generates the tuples f the result incrementally, where the time interval between generating a tuple f the result and generating the next tuple is small. Frmally, we say that an algrithm runs in plynmial delay if the delay between successive tuples is plynmial nly in the size f the input. An algrithm that runs with plynmial delay has a ttal time that is linear in the size f its utput. In this paper, we present an algrithm fr cmputing full disjunctins that is aimed at achieving small delays between tuples. In particular, we present the first algrithm fr cmputing full disjunctins with plynmial delay. Cmputing full disjunctins with shrt delays between tuples is in itself an imprtant gal (regardless f the ttal cmputatin time), since there are applicatins that can benefit frm getting tuples f the result as a stream that starts flwing as sn as pssible, befre the whle cmputatin is cmplete. A mtivating example is a parallel system, where ne prcessr cmputes the tuples f a full disjunctin and anther prcessr receives these tuples fr further prcessing. A shrt delay is required t achieve a prper utilizatin f the secnd prcessr. As anther example, cnsider a server n the Web that cmputes a full disjunctin and then sends the result ver the Internet. While the cmputatin is ging n, tuples are sent as sn as they are prduced, thereby reducing latency and preventing bandwidth bttlenecks. A third imprtant example is when a user inspects the tuples ne at a time, as sn as they are generated, and may even decide at any mment that n mre tuples are needed and the cmputatin can terminate. In such a scenari, the system shuld nt let the user wait fr t lng when she is asking fr the next tuple. Thus, minimizing the delay is imprtant and it may even save resurces, since cmputing the whle result is nt always required. Related Wrk. In sme cases, full disjunctins can be frmulated as sequences f uterjins. In [13], it was shwn that the full disjunctin f a given set f relatins is equivalent t sme sequence f uterjins if and nly if the relatin schemes frm a γ-acyclic hypergraph [6]. Based n this result, an algrithm fr efficiently cmputing full disjunctins f γ-acyclic sets f relatins by a sequence f uterjins was prpsed in [13]. An algrithm that cmputes full disjunctins f an arbitrary set f relatins was presented in [12]. The abve tw algrithms run in plynmial time in the size f the input and the utput. In ther wrds, all the tuples f the result are returned in time that is plynmial in the cmbined size f the given relatins and their full disjunctin. Nte, hwever, that these algrithms d nt return any tuple until all prcessing is cmplete (and cannt be easily adapted t d s). In [2], an incremental algrithm fr cmputing full disjunctins was presented. In this algrithm, the delay is shrt while the first several tuples are generated, but it gets lnger as the cmputatin ges n. Frmally, the cmputatin is in incremental plynmial time, i.e., the delay between the ith and the (i + 1)st tuples f the result is plynmial in the cmbined size f the input and the first i tuples that are generated. Nte, hwever, that the substantial slwdwn between successive tuples is nt suitable in the scenaris described earlier. The algrithm f [2] can als be adapted t generating tuples in ranked rder prvided that the ranking functin is mntnically c-determined. Cntributins. Our main cntributin is the algrithm BiCmNLOJ fr efficiently cmputing full disjunctins in the general case. This algrithm is superir t previus nes in three ways. Fremst, it is the first algrithm that cmputes a full disjunctin with plynmial delay. Secnd, the ttal runtime is linear in the size f the utput, whereas the best previusly knwn algrithm [2] runs in time that is quadratic in the size f the utput. Third, by dividing the scheme graph f the relatins int bicnnected cmpnents, ur algrithm can further ptimize the runtime (and shrten the delay) by incrprating uterjins int the cmputatin. Befre discussing BiCmNLOJ, we first present tw algrithms, namely NLOJ and PDelayFD, that we build upn t derive this general algrithm. Sectin 3 describes the algrithm NLOJ that uses left-deep uterjins t cmpute a full disjunctin f an acyclic 1 set f relatins. This algrithm is highly efficient with linear delay (in the size f the input) between results. The algrithm PDelayFD (Sectin 4) cmputes a full disjunctin fr an arbitrary set f relatins with a plynmial delay between tuples. We nte that PDelayFD is already a significant imprvement ver the state f the art [2], since it runs in plynmial delay and in ttal time that is linear (rather than quadratic) in the size f the utput. Finally, BiCmNLOJ, which cmbines the efficiency f NLOJ with the generality f PDelayFD, is presented (Sectin 5). The cmbinatin f NLOJ and PDelayFD is rather intricate, since special techniques must be applied t retain the plynmial delay between tuples. In this paper, we als shw hw full disjunctins can be integrated with standard SQL peratrs (Sectin 6). In particular, we shw hw and when selectin, prjectin and rder-by can be integrated directly int the prcessing f a full disjunctin. All algrithms discussed in this paper, as well as the algrithm f [2], were implemented as blck-based algrithms within an pen-surce database system. Extensive experimentatin was perfrmed t prve the superirity f Bi- CmNLOJ ver the state-f-art algrithm ([2]) in the ttal cmputatin time as well as the delays between tuples in the result (Sectin 7). Our experimentatins als shw that the technique emplyed in BiCmNLOJ f dividing the scheme graph int bicnnected cmpnents is f great imprtance fr deriving an algrithm that can be efficiently integrated int a real database system. 2. PRELIMINARIES In this sectin, we present ur data mdel, we define the cncept f a full disjunctin, and we discuss enumeratin algrithms that act as iteratrs. 2.1 Relatins and Schemes In this wrk, we use the relatinal data mdel. Relatins, 1 Our ntin f acyclicity is defined in Sectin

3 rel. R 1 R 2 R 3 R 4 R 5 R 6 R 7 R 8 R 9 R 1 attributes A, B,C, D B, E A, E,F, G F, H G, H C, I D, J, K K, L J, L, M M, N sc(r 5) sc(r 4) sc(r 2) sc(r 6) sc(r 3) sc(r 1) sc(r 8) sc(r 7) sc(r 9) sc(r 1) A B C t t t (a) R 11 A E F v v (c) R 13 A B D u u (b) R 12 A E G w w (d) R 14 (a) Schemes f R 1 rel. attributes R a A, B, C, D, E, F, G R b A, E, F, G, H R c C, I R d D, J, K, L, M R e M, N (c) Schemes f R 1a (b) Scheme graph G sc(r 1) sc(r b ) sc(r a) sc(r c) sc(r d ) sc(r e) (d) Scheme graph G sc(r 1a ) Figure 2: Relatins R 11, R 12, R 13 and R 14 mst ne tuple frm each relatin (hence, m n). We say that T is cnnected if the tuples f T belng t a cnnected subset f R; that is, if {R i1,..., R im } is cnnected, where each R ij is the relatin that cntains t j. We say that T is jin cnsistent if every tw tuples in T are jin cnsistent; that is, fr every 1 j < k m, the tuples t j and t k are jin cnsistent. Fr clarity, we assume that different tuples frm the same relatin are nt jin cnsistent. JC(T) dentes that T is jin cnsistent and JCC(T) dentes that T is bth jin cnsistent and cnnected. Figure 1: Schemes f relatin sets R 1 and R 1a tuples and schemes are defined in the usual way 2. Given a tuple t and an attribute A f t, we use t[a] t dente the value f t fr A. We use t dente the null value. Given a relatin R, we use sc(r) t dente the scheme f R, i.e., the set cmprising all the attributes f R. Given a tuple t 1 f a relatin R 1 and a tuple t 2 f anther relatin R 2, we say that t 1 and t 2 are jin cnsistent if they agree (i.e., are equal and nnnull) n cmmn attributes, that is, t 1[A] = t 2[A] hlds fr every attribute A in sc(r 1) sc(r 2). We use R (pssibly with a subscript) t dente a set f relatins. Cnsider a set R = {R 1,..., R n} f relatins. The scheme graph f R, dented G sc(r), is the undirected graph that is defined as fllws. The ndes f G sc(r) are the schemes sc(r 1),...,sc(R n). There is an edge between tw schemes if they share ne r mre cmmn attributes. Mre frmally, G sc(r) cntains an edge between sc(r i) and sc(r j) if i j and sc(r i) sc(r j). We say that R is cnnected if G sc(r) is cnnected. R is cyclic if G sc(r) cntains a cycle; therwise, it is acyclic. Nte that ur ntin f acyclicity implies γ-acyclicity, but the tw are nt equivalent. Example 2.1. The scheme graphs f tw sets f relatin R 1 = {R 1,..., R 1} and R 1a = {R a,..., R e} is depicted in Figure 1. The dtted plygns in Figure 1(b) shuld be ignred fr nw their meaning will be explained later. Nte that bth R 1 and R 1a are cnnected, R 1 is cyclic and R 1a is acyclic. 2.2 Full Disjunctins Cnsider a set f relatins R = {R 1,..., R n}. A tuple set f R is any set f tuples T = {t 1,..., t m} cnsisting f at 2 The algrithms in this paper assume that relatins are sets f tuples, but they can be easily adapted t the case where relatins are multisets f tuples. Definitin 2.2. Let R be a set f relatins. MaxJCC(R) is the set f all tuple sets T f R, such that (1) JCC(T), and (2) T is maximal, that is, n tuple set f R that is bth jin cnsistent and cnnected als prperly cntains T. Example 2.3. Let R = {R 11, R 12, R 13, R 14}, where R 11, R 12, R 13 and R 14 are the relatins f Figure 2. Cnsider the tuple set T = {t 1, u 1, v 1}. It is easy t see that JCC(T) hlds, since (1) the scheme graph f {R 11, R 12, R 13} is cnnected and (2) t 1, u 1 and v 1 agree n their cmmn attributes. Furthermre, T is in MaxJCC(R), since JCC(T ) des nt hld fr any tuple set T that prperly cntains T. The reader can easily verify that MaxJCC(R) cmprises the fllwing six tuple sets: {t 1, u 1, v 1}, {t 2, v 2, w 2}, {t 3, v 1}, {u 2, v 2, w 2}, {t 1, u 1, w 1} and {t 3, w 1}. Cnsider a jin-cnsistent tuple set T. Let S be a scheme that cntains all the attributes that appear in the tuples f T and perhaps additinal attributes. We use embed S(T) t dente the tuple ver S that is btained by first applying the natural jin t the tuples f T and then adding clumns with nulls fr the remaining attributes f S. In ther wrds, embed S(T) is the tuple t, such that fr all attributes A f S, if T cntains a tuple t with the attribute A, then t[a] = t [A]; therwise, t[a] =. We use embed S(t) as a shrthand ntatin fr embed S({t}). As an example, let T = {t 1, u 1, v 1} be the tuple set frm Example 2.3. Suppse that S is the set f all attributes appearing in the relatins R 11, R 12, R 13 and R 14 f Figure 2, i.e., S = {A, B, C, D, E, F, G}. Then, embed S(T) is the tuple t = (A : 1, B : 1, C : 1, D : 1, E : 11, F : 1, G : ). Nte that t[g] =, since G des nt appear in R 11, R 12 r R 13. Given a set f relatins R, the full disjunctin f R is btained by first applying the natural jin t the tuples f each T MaxJCC(R) and then taking the uter unin. Fllwing is a frmal definitin. Definitin 2.4 (Full Disjunctin). Let R be a set 741

4 A B C D E F G s s s s s s Figure 3: FD({R 11, R 12, R 13, R 14}) f relatins and S be the unin f the schemes f the relatins in R. The full disjunctin f R, dented FD(R), is the relatin {embed S(T) T MaxJCC(R)}. As an example, cnsider again the relatins R 11, R 12, R 13 and R 14, presented in Figure 2. The full disjunctin f these relatins cnsists f 6 tuples and is shwn in Figure 3. Tuple s 1, fr instance, is the tuple embed S({t 1, u 1, v 1}), where S = {A, B, C, D, E, F, G}. Our definitin f full disjunctins differs frm that f [13] in that we allw the surce relatins t cntain null values. When surce relatins d nt have null values, the tw definitins are equivalent. 2.3 Enumeratin Algrithms In this paper, we develp enumeratin algrithms fr cmputing full disjunctins. An enumeratin algrithm E is an algrithm that acts as an iteratr [1] but is written as an rdinary algrithm, i.e., next() and hasnext() functins are nt defined explicitly (in Sectin 2.4 we explain hw iteratrs are used in enumeratin algrithms). An enumeratin algrithm generates, fr a given input x, a sequence e 1,..., e m f results. Each element e i is prduced by the peratin utput ( ). We say that E(x) enumerates the set S if, in the executin f E(x), every element f S is prduced exactly nce. Next, we discuss measures f efficiency fr enumeratin algrithms. Plynmial time cmplexity is nt a suitable yardstick f efficiency when analyzing an enumeratin algrithm. This is because the size f the utput culd be much larger (e.g., expnentially larger) than the size f the input. In [11], several definitins f efficiency fr enumeratin algrithms are discussed. The weakest definitin is plynmial ttal time, where the running time is plynmial in the cmbined size f the input and the utput. Tw strnger definitins cnsider the time that is needed fr generating the ith element, after the first i 1 elements have already been created. Incremental plynmial time means that the ith element is generated in time that is plynmial in the cmbined size f the input and the first i 1 elements. The strngest definitin is plynmial delay, where the ith element is generated in time that is plynmial nly in the input. It is easy t see that every algrithm that enumerates with plynmial delay als runs in ttal time that is linear in the size f the utput. Fr instance, if the delay is O(d), where d is plynmial in the size f the input, and there are m elements in the result, then the runtime will be O(m d). Nte that, in general, SQL queries cannt be evaluated in plynmial ttal time, i.e., the evaluatin may take expnential time if the size f the query is nt fixed. Fr example, the jin f a set f relatins cannt be evaluated in plynmial ttal time, since it is NP-cmplete just t determine whether the result f the jin is nnempty [1]. It was shwn that full disjunctins can be cmputed in plynmial ttal time [12] and even in incremental plynmial time [2]. 2.4 Iteratrs We use iteratrs in ur enumeratin algrithms. An iteratr is a prgramming cnstruct that makes it pssible t iterate ver the results f an enumeratin algrithm, and even pause the cmputatin while retaining the internal state. Iteratrs are needed in rder t realize plynmial delay when enumeratin algrithms use each ther. If an enumeratin algrithm Ê calls anther enumeratin algrithm E and then waits until E generates all f its utput, the delay culd be expnential. Instead, E shuld halt after generating and returning the first result t Ê. Later, E shuld resume its peratin, in rder t generate the next result when Ê is ready t get it. Frmally, an iteratr is an bject that perates n tp f a specific executin f an enumeratin algrithm. The iteratr utputs each result f the underlying algrithm nly upn an explicit next request. Cnsider an enumeratin algrithm E and an input x. An iteratr I ver E(x) is cnstructed by the peratin I new Iteratr(E, x). Suppse that E(x) enumerates {e 1,..., e m}. When executing I.next() fr the first time, the executin f E(x) starts and cntinues until the first result e 1 is generated. Specifically, the cmmand utput (Y ) in the executin f E(x) returns bth the value f Y and the cntrl f the executin t the prcedure that called I.next(). Nte that the value f Y is returned t the user (i.e., printed) nly when utput (Y ) is executed by the utermst prcedure. When I.next() is executed again, the executin f E(x) is resumed until e 2 is generated and s n. Hence, the elements e 1,..., e m are enumerated by repeatedly executing m times the cmmand I.next(). The cst f executing these m calls t I.next() is the ttal cst f executing E(x). The peratin I.hasNext() returns true if there are mre elements t be generated; therwise, it returns false. Nte that I can implement hasnext() by actually cntinuing the executin f E(x) t check if anther result is generated. In ur algrithms, hwever, hasnext() can be implemented by a simple inspectin f the data structures. 3. ACYCLIC SETS OF RELATIONS Full disjunctins are an assciative generalizatin f binary uterjins t any number f relatins. Left-deep sequences f uterjins (r just left-deep uterjins fr shrt) are als a generalizatin f uterjins t any number f relatins, but they are nt assciative. Nevertheless, we will shw that in cmmn cases, namely, when the given set f relatins is acyclic, full disjunctins can be frmulated as left-deep uterjins. Fr these cases, a plynmial-delay algrithm is presented in this sectin. This algrithm frms the basis f ur algrithm fr the general case (presented in Sectin 5). We nte in passing that [13] shwed hw t cmpute full disjunctins fr a special case using uterjins. Hwever, their algrithm did nt run with plynmial delay. See Sectin 1 fr details. We use R 1 R 2 t dente the uterjin f R 1 and R 2. When R 1 and R 2 are cnnected, the uterjin and the full disjunctin f the tw relatins are the same. If, hwever, R 1 and R 2 d nt share any attribute and bth are nnempty, then the uterjin is the Cartesian prduct whereas 742

5 NLOJ((R 1,..., R n)) 1: if n = 1 then 2: utput all tuples f R n 3: else 4: S := sc(r 1) sc(r n) 5: I := new Iteratr(NLOJ,(R 1,..., R n 1)) 6: while I.hasNext() d 7: ˆt := I.next() 8: fr all tuples t R n, s.t. JC({t, ˆt}) d 9: mark t 1: utput embed S({t, ˆt}) 11: if t R n, s.t. JC({t, ˆt}) then 12: utput embed S(ˆt) 13: fr all unmarked tuples t R n d 14: utput (embed S(t)) Figure 4: Cmputing (R 1,..., R n) the full disjunctin is the uterunin f the tw relatins. We use (R 1,..., R n) t dente the left-deep uterjin f R 1,..., R n. Frmally, (R 1, R 2) def = (R 1 R 2), and fr n > 2, recursively, (R 1, R 2, R 3,..., R n) def = (R 1 R 2, R 3,..., R n). Fr example, (R 1, R 2, R 3, R 4) def = ((R 1 R 2) R 3) R 4. Cnsider a cnnected set f relatins R = {R 1,..., R n}. A cnnected-prefix rdering f R is an rdering R 1,..., R n f the relatins f R, such that {R 1,..., R i} is cnnected fr all 1 i n. Nte that every cnnected set f relatins has a cnnected-prefix rdering. Mrever, this rdering can easily be created (e.g., by a DFS traversal f the scheme graph). As an example, ne cnnected-prefix rdering f the relatin set R 1a f Figure 1(d) is R c, R a, R b, R d, R e. Prpsitin 3.1 shws that when the scheme graph is a tree, a cnnected-prefix rdering yields a left-deep uterjin that is equivalent t the full disjunctin. This can be prven using the results f [13]. Prpsitin 3.1. Let R be a cnnected and acyclic set f relatins. If R 1,..., R n is a cnnected-prefix rdering f R, then FD(R) = (R 1,..., R n). We nw describe the algrithm NLOJ (Nested-Lp OuterJin) fr cmputing left-deep uterjins with plynmial delay. This algrithm is shwn in Figure 4. The input t the algrithm NLOJ cnsists f a tuple (R 1,..., R n) f relatins, where the rder f the relatins is a cnnected-prefix rdering. The utput is the relatin (R 1,..., R n). Essentially, this algrithm is similar t the nested-lp jin, but it als generates tuples that cannt be jined. In rder t simplify the presentatin, NLOJ is presented as a recursive algrithm. In ur implementatin f NLOJ (and f all the ther algrithms cnsidered in this paper), we used a nnrecursive versin, t imprve the efficiency. When the input cnsists f nly ne relatin (i.e., n = 1), all the tuples f that relatin are enumerated (Line 2). When n > 1, all the tuples f (R 1,..., R n 1) are recursively enumerated (Lines 5 14) by the recursive iteratr I cnstructed PDelayFD(R) 1: let R be an arbitrary relatin in R 2: Q R an empty queue 3: fr all t R d 4: TupExtFD(R, R,t, Q R) 5: RelExcFD(R, R, Q R) Figure 5: Cmputing FD(R) in Line 5. Fr each enumerated tuple ˆt, the algrithm iterates, in Line 8, ver all the tuples t R n that can be jined with ˆt. In Lines 9 1, each such tuple t is marked and its natural jin with ˆt is sent t the utput. If n such tuple t exists, then in Line 12 the algrithm utputs ˆt after padding it with nulls fr the attributes f sc(r n) that are missing frm ˆt. Finally, in Lines 13 14, the algrithm utputs all the tuples f R n that are nt marked (i.e., thse tuples that are never chsen in Line 8). Each such tuple t is padded with nulls fr the attributes f R 1,..., R n that it des nt include. Nte that NLOJ can be viewed as a pipelined cmputatin f binary uterjins. The iteratr I prduces the results f the previus stage in the pipeline. It is easy t shw that this algrithm has an O(N) delay, where N is the ttal number f tuples in the relatins. Furthermre, Line 8 frms the bttleneck f the cmputatin. Thus, by using standard jin techniques (e.g., nes that are based n indices), the delay can be substantially reduced. Such techniques are beynd the scpe f this paper. Frm Prpsitin 3.1, we cnclude that NLOJ can be used fr cmputing full disjunctins f acyclic and cnnected sets f relatins. If the given set f relatins is nt cnnected, then we can separately cmpute the full disjunctin f each cnnected cmpnent and pad each generated tuple with nulls fr all the attributes f the ther cnnected cmpnents. In ther wrds, we cmpute the full disjunctins f the different cmpnents and then cmpute the uterunin f these full disjunctins. Given that N is the ttal number f tuples in the relatins f R, the delay f NLOJ is O(N) as frmally stated in the fllwing therem. Therem 3.2. Given an acyclic set f relatins R cntaining N tuples altgether, the algrithm NLOJ cmputes FD(R) with O(N) delay. Fr sme cyclic sets f relatins, the full disjunctin is still equivalent t a left-deep sequence f uterjins; hwever, characterizing these cases is beynd the scpe f this paper. In such cases, the full disjunctin can be cmputed by NLOJ. In Sectin 5, we use NLOJ as part f an algrithm fr cmputing full disjunctins in the general case. 4. GENERAL SETS OF RELATIONS In this sectin, we present PDelayFD which is the first algrithm that cmputes full disjunctins f general sets f relatins with plynmial delay. By definitin, having plynmial delay guarantees that the ttal time f this algrithm is linear in the size f the utput. This imprves the best knwn upper bund [2] fr cmputing full disjunctins which is quadratic in the size f the utput. 743

6 TupExtFD(R, R,t, Q R) 1: C t := 2: Q t := {ExtendTMax(R, {t})} 3: S := R R sc(r ) 4: while Q t is nt empty d 5: remve the tp element T frm Q t 6: utput (embed S(T)) 7: C t.insert(t) 8: fr all s R \ {R} d 9: let T s be the maximal tuple set, s.t. s T s, T s T {s} and JCC(T s) 1: ˆT :=ExtendTMax(R, Ts) 11: if t ˆT and / C t Q t then 12: Q t.insert( ˆT) 13: if ˆT R = and ˆT / Q R then 14: Q R.insert( ˆT) ExtendTMax(R, T) 1: ˆT := T 2: fr all R R d 3: if R ˆT = then visited[r]:=false 4: else visited[r]:=true 5: while there is R R, s.t. visited[r]=false and R and ˆT share a cmmn attribute d 6: visited[r]:=true 7: fr all t R d 8: if JCC( ˆT {t}) then 9: ˆT := ˆT {t} 1: break 11: return ˆT Figure 7: Extending a tuple set Figure 6: Cmputing FD t (R) Cnsider a relatin R R. The set cmprising all tuple sets in MaxJCC(R) that cntain a given tuple t R is dented by MaxJCC t (R). The subset f FD(R) that crrespnds t MaxJCC t (R) is dented by FD t (R); that is, FD t (R) is the relatin {embed S(T) T MaxJCC t (R)}, where S is the unin f all the schemes f the relatins f R. The set cnsisting f all tuple sets in MaxJCC(R) that cntain n tuple frm R is dented by MaxJCC R (R). The crrespnding subset f FD(R) is dented by FD R (R). Clearly, FD(R) = FD R (R) t R FD t (R). We cmpute FD(R) in the fllwing manner. We first chse a relatin R R. We then enumerate FD t (R) fr all tuples t R. Finally, we enumerate FD R (R). This is basically the algrithm PDelayFD f Figure 5. T explain this algrithm, we examine the algrithms that it calls. We first describe the algrithm TupExtFD f Figure 6. The input f this algrithm cnsists f a set R f relatins, a relatin R R, a tuple t R and a queue Q R. This algrithm cmputes FD t (R) with plynmial delay. It uses (in Lines 2 and 1) the algrithm ExtendTMax f Figure 7 that, given a set R f relatins and a jin cnsistent and cnnected set T f tuples in R, returns an arbitrary maximal set ˆT f tuples in R that satisfy JCC( ˆT) and T ˆT. In particular, if T is a singletn {t}, then ExtendTMax(R, T) returns an arbitrary tuple set f MaxJCC t (R). The algrithm ExtendTMax wrks as fllws. It starts with T as the arbitrary set ˆT. It marks as visited all the relatins f R that cntain a tuple f T (Lines 2 4). Then, it iteratively extends ˆT while keeping ˆT cnnected and jin cnsistent (Lines 5 1). In each iteratin, a tuple is taken frm a relatin that has nt yet been visited (hence, ˆT never includes tw tuples frm the same relatin). The tuple is chsen arbitrarily frm a relatin that has a shared attribute with ˆT, since tuples f ther relatins are nt cnnected t ˆT. Only tuples that are cnnected and jin cnsistent with ˆT are added t ˆT (Lines 8 9). The iteratins cntinue until ˆT cannt be extended anymre. We cntinue nw with the descriptin f TupExtFD. The algrithm TupExtFD uses a repsitry C t fr string tuple sets that have been printed and a queue Q t fr string tu- ple sets awaiting t be printed. Initially, C t is empty and Q t is a singletn cntaining ne tuple set f MaxJCC t (R). Lines 5 14 are then executed repeatedly, until Q t is empty. In Lines 5 7, a tuple set T Q t is mved frm Q t t C t, transfrmed int the crrespnding tuple f FD(R) (by jining and adding nulls) and printed. Then, Lines 9 14 are repeatedly executed fr each tuple s f sme relatin in R\{R} (by a slight abuse f ntatin, we write s R \ {R}). In these lines, the algrithm tries t add the tuple s t the set T. Since JCC(T {s}) des nt necessarily hld, the maximal amng all subsets f T T {s} satisfying s T and JCC(T ) is prduced. Let T s dente this set. It is easy t shw that T s is unique. 3 In Line 1, T s is extended t a maximal tuple set ˆT MaxJCC(R) by executing Extend- TMax(R, T s). In Lines 11 12, the algrithm tests whether t is cntained in ˆT. If s, then ˆT is added t Q t, prvided that ˆT is in neither C t nr Q t. Lines are used fr enumerating FD R (R) and are discussed in detail later. By nw, we have described hw t enumerate FD t (R) with plynmial delay. Thus, FD(R) can be enumerated with plynmial delay if FD R (R) can be likewise enumerated. Hwever, the fllwing prpsitin indicates that this is nt likely t be true (and ne has t expect an expnential first delay when cmputing FD R (R)). Prpsitin 4.1. Testing whether FD R (R) is NPcmplete. T vercme this prblem, we use the fact that the relatin FD R (R) is cmputed nly after generating the relatins FD t (R) fr all t R. It turns ut that we can cllect enugh infrmatin, during the enumeratin f all the FD t (R), in rder t enumerate FD R (R) with plynmial delay. In particular, in Lines f TupExtFD, we test whether the maximal tuple set ˆT is in MaxJCC R (R). If s, then we stre ˆT in a queue Q R. Nte that in the algrithm PDelayFD, all executins f TupExtFD use the same Q R and, furthermre, Q R is given by reference. Finally, Q R (that cntains all the cllected tuple sets) is used 3 The tuple set T s can be btained frm T {s} as fllws. First, remve all the tuples t such that {s, t } is nt jin cnsistent (i.e., fr sme attribute A, either s[a] t [A] r s[a] = t [A] = ). In particular, t is remved if it is frm the same relatin as s. Then, remve tuples that are n lnger in the same cnnected cmpnent as s. 744

7 RelExcFD(R, R, Q R) 1: C R := 2: S := R R sc(r ) 3: while Q R is nt empty d 4: remve the tp element T frm Q R 5: utput (embed S(T)) 6: C R.insert(T) 7: fr all s R \ {R} d 8: let T s be the maximal tuple set, s.t. s T s, T s T {s} and JCC(T s) 9: ˆT :=ExtendTMax(R, Ts) 1: if ˆT R = and ˆT / Q R C R then 11: Q R.insert( ˆT) Figure 8: Cmputing FD R (R) fr enumerating FD R (R) with plynmial delay by applying the algrithm RelExcFD (shwn in Figure 8). The algrithm RelExcFD is very similar t TupExtFD, except fr the fllwing small differences. RelExcFD extends tuples f Q R (instead f tuples f Q t). Nte that RelExcFD des nt use Q t at all, since the tuples f all the FD t (R) have already been printed. RelExcFD uses C R t stre tuple sets that have already been remved frm Q R and printed. In cmparisn, TupExtFD des nt remve any element frm Q R and, s, it des nt need C R. Tuple sets ˆT, which are created by extending T s in Line 9, are inserted int Q R nly if they d nt cntain any tuple f R and are in neither Q R nr C R. Next, we prve the crrectness f the algrithm PDelayFD. Fr that, we use the fllwing lemma. Lemma 4.2. Cnsider a set f relatins R and a relatin R R. 1. Fr a tuple t R and a queue Q R MaxJCC R (R), TupExtFD(R, R, t, Q R) enumerates FD t (R). 2. Suppse that Q R is initially an empty queue. After executing TupExtFD(R, R, t, Q R) fr all tuples t R, RelExcFD(R, R, Q R) enumerates FD R (R). Prf. We will first prve Claim 1. Cnsider an executin f the algrithm TupExtFD(R, R,t, Q R). Lines 1 12 f the algrithm guarantee that nly tuples f FD t (R) are enumerated and that each tuple is printed at mst nce. It remains t prve that every tuple f FD t (R) is printed. Suppse, by way f cntradictin, that there is sme T MaxJCC t (R), such that embed S(T ) is nt printed by the algrithm. Clearly, the algrithm prints at least ne tuple set f MaxJCC t (R). Let T MaxJCC t (R) be a tuple set, amng all the printed nes, that yields a maximal T m, such that (1) t T m (2) T m T T, and (3) JCC(T m). Nte that T m culd be the set {t}. Since T is cnnected, there exists a tuple s T \T m, such that T m {s} is cnnected. Cnsider the iteratin f Lines 5 14 in which T is printed. Let T s be the tuple set that is generated in Line 9 when the tuple s is chsen in Line 8. Since T s is maximal and JCC(T m {s}), it fllws that T m {s} T s. Thus, T m {s} is cntained in the tuple set ˆT MaxJCC(R) that is generated in Line 1. Since t ˆT, ne f the fllwing three ptins must hld: (1) ˆT is inserted int Q t, (2) ˆT is in Q t, r (3) ˆT was in Q t in the past (and is nw in C t). Since the algrithm prints every tuple set that is inserted int Q t, the existence f ˆT cntradicts the chice f T and, hence, the existence f T. The prf f Claim 2 is similar, except fr the fllwing differences. First, we assume that T FD R(R). Secnd, amng all the tuple sets f MaxJCC(R) that are printed either in ne f the executins f TupExtFD r in the executin f RelExcFD, we chse a tuple set T that yields a maximal T m, such that (1) T m T T, and (2) JCC(T m). Third, the set ˆT is generated in either Line 1 f TupExtFD r Line 9 f RelExcFD. Nte that a cntradictin is btained immediately if ˆT FD R(R). Otherwise, we use the crrectness f Claim 1. Crrectness f the algrithm PDelayFD fllws immediately frm Lemma 4.2 and is frmally stated in the next therem. This therem als analyzes the running time f PDelayFD. We assume that Q t, C t, Q R and C R prvide lgarithmic access time; that is, insertin, remval and membership test require O(lg k) cmparisns, where k is the number f elements in the repsitry. Fr simplicity, ur analysis assumes that the relatins f R have a bunded number f clumns. Therefre, peratins such as testing jin cnsistency f tw tuples can be dne in cnstant time. We use n t dente the number f relatins in R and N t dente the ttal number f tuples in the relatins f R. Therem 4.3. PDelayFD(R) cmputes FD(R) with an O(n 2 N + N 2 + nn lg N) delay. Therem 4.3 states that PDelayFD cmputes full disjunctins with plynmial delay and in ttal time that is linear in the size f the utput, fr any arbitrary set f relatins. In bth these aspects, PDelayFD is superir t the state f the art algrithm [2]. 5. THE MAIN ALGORITHM The algrithm NLOJ fr cmputing left-deep uterjins is mre efficient than PDelayFD, since its delay is linear cmpared with the quadratic delay f PDelayFD. Hwever, NLOJ nly applies t acyclic sets f relatins. In fact, in [13], it was shwn that there are full disjunctins that cannt be frmulated as any sequence f uterjins (i.e., with arbitrary placement f parentheses). In this sectin, we shw that the prblem f cmputing full disjunctins can be decmpsed int smaller subprblems, such that PDelayFD is needed nly fr slving subprblems and NLOJ can cmpute the full disjunctin using the slutins f the subprblems. Fr nw, we assume that there are n null values in the input relatins. Tward the end f the sectin, we explain why this assumptin is needed and we prvide a simple slutin fr the case f input relatins with null values. Let R be a set f relatins. A subset R 1 f R is said t be bicnnected if fr each pair f relatins R 1, R 2 R 1, the scheme graph f R has (at least) tw nde-disjint paths cnnecting sc(r 1) and sc(r 2). A bicnnected cmpnent f R is a bicnnected subset R m that is maximal w.r.t. cntainment (that is, n bicnnected subset f R prperly cntains R m). Nte that a relatin can belng t mre than ne bicnnected cmpnent (such a relatin is an articulatin pint). Als nte that a bicnnected cmpnent may cntain nly ne relatin. Fr instance, if R is acyclic, then all 745

8 rel. R 15 R 16 R 17 R 18 R 19 R 2 R 21 attributes A, B, C A, D, E A, B, F E, G, H,I H, J I, J, K D, G (a) Schemes f R 4 sc(r 15) R f R g sc(r 2) sc(r 17) sc(r 16) sc(r 18) sc(r 19) sc(r 21) (b) Scheme graph G sc(r 4) Figure 9: Schemes f a relatin set R 4 R h the bicnnected cmpnents f R are singletns. It is knwn that bicnnected cmpnents can be efficiently cnstructed by detecting articulatin pints [3]. Example 5.1. The bicnnected cmpnents f R 1 (Figure 1(a)), surrunded by dtted plygns in the scheme graph f Figure 1(b), are the fllwing: R a = {R 1, R 2, R 3}, R b = {R 3, R 4, R 5}, R c = {R 6}, R d = {R 7, R 8, R 9} and R e = {R 1}. Cnsider a cnnected set R f relatins with the bicnnected cmpnents R 1,..., R k. We say that the sequence R 1,..., R k is a strng cnnected-prefix rder (abbr. SCP rder) if fr all 2 i k, the fllwing tw cnditins hld. First, R i has an attribute that appears in R 1 R i 1. Secnd, if R j (R 1 R i 1) fr sme i j n, then R i (R 1 R i 1). It is straightfrward t shw that an SCP rder f the bicnnected cmpnents exists (prvided that R is cnnected) and can be btained efficiently by the fllwing prcedure. First, chse an arbitrary cnnected cmpnent R 1. Having chsen R 1,..., R i 1 (and i k), check whether ne f the remaining cmpnents cntains a relatin R that is in R 1 R i 1. If s, chse R i t be such a cmpnent. Otherwise, chse R i t be a cmpnent that has an attribute A that appears in R 1 R i 1. In Example 5.1, R a, R b, R c, R d, R e is an SCP rder fr the bicnnected cmpnents f R 1. R a, R c, R b, R d, R e is nt an SCP rder, since amng R b and R c, nly R b cntains a relatin f R a. The next therem shws that the full disjunctin f R can be btained by first cmputing the full disjunctin f each bicnnected cmpnent R i and then applying a left-deep sequence f uterjins in an SCP rder. Nte that if R i is a singletn {R}, then FD(R i) is actually R itself. Therem 5.2. Suppse that R is a cnnected set f relatins and R 1,..., R k are the bicnnected cmpnents f R in an SCP rder. Then, FD(R) = (FD(R 1),...,FD(R k )). Nte that in Therem 5.2, the scheme graph f the relatins {FD(R 1,),...,FD(R k )} is nt necessarily acyclic. Thus, the prf is different frm that f Prpsitin 3.1 and mre intricate. Example 5.3. Figure 9 depicts the scheme f the relatin set R 4 = {R 15,..., R 21}. The bicnnected cmpnents f R 4, surrunded by dtted rectangles in Figure 9(b), are the sets R f, R g, and R h. It is easy t see that the sequence R f, R h, R g is an SCP rder f the three bicnnected cmpnents. It fllws frm Therem 5.2 that FD(R 4) = (FD(R f ),FD(R h ),FD(R g)). In cmparisn, R f, R g, R h is nt in an SCP rder, since R h shares a relatin with R f (the relatin R 16) while R g des nt share a relatin with R f. In this specific case, it can als be shwn that FD(R 4) is nt equivalent t (FD(R f ),FD(R g),fd(r h )). That is, there are relatins R 15,..., R 21 fr which FD(R 4) (FD(R f ),FD(R g),fd(r h )). It is imprtant t emphasize that an SCP rder is nt a necessary cnditin fr equivalence between FD(R) and (FD(R 1),...,FD(R k )). It is nly a sufficient cnditin fr having an equality between the tw expressins. Therem 5.2 implies the fllwing three-step algrithm fr cmputing the full disjunctin f a cnnected set f relatins R. First, find the bicnnected cmpnents f R and btain an SCP rder R 1,..., R k. Secnd, cmpute the full disjunctins FD(R 1),...,FD(R k ). Third, cmpute (FD(R 1),...,FD(R k )) and return it as the answer. This algrithm enumerates, in plynmial ttal time, the full disjunctin f a cnnected set f relatins. In the abve algrithm, the full disjunctins f the bicnnected cmpnents are cmpletely generated befre applying the left-deep uterjins. The size f any relatin FD(R i) can be expnential in the size f the riginal relatins and, therefre, the delay f the whle algrithm is nt plynmial an expnential time is needed fr cmputing intermediate full disjunctins that are the input t NLOJ, and the delay f NLOJ is linear in the size f its input. Next, we shw hw the algrithm presented abve can be mdified t enumerate full disjunctins with plynmial delay. Cnsider a cnnected set R f relatins and let R 1,..., R k be the bicnnected cmpnents f R in an SCP rder. Fr i = 2,..., n, the cnnecting relatin f R i is defined as fllws. If R i cntains a relatin R that appears in R 1 R i 1, then the cnnecting relatin is R (nte that in this case, R is an articulatin pint); therwise, the cnnecting relatin is a relatin that cntains an attribute A that appears in R 1 R i 1. It can be shwn that in bth cases, the cnnecting relatin is unique. We nw describe hw t enumerate the full disjunctin f a set f relatins R, given that R 1,..., R k is a sequence f the bicnnected cmpnents f R in an SCP rder. Accrding t Therem 5.2, it is sufficient t enumerate the expressin (FD(R 1),...,FD(R k )). This is dne in the algrithm BiCmNLOJ f Figure 1. BiCmNLOJ can be seen as an advanced versin f NLOJ. The main difference between BiCmNLOJ and NLOJ is that while NLOJ enumerates the left-deep uterjin f the relatins f the input, BiCmNLOJ enumerates a left-deep uterjin f relatins that are full disjunctins f bicnnected cmpnents. Tw majr changes are necessary due t the difference between the algrithms. First, given a tuple ˆt, we cannt enumerate FD(R k ) with plynmial delay by simply lking fr tuples that can be jined with ˆt, because f the fllwing reasn. The size f FD(R k ) culd be expnential in the size f the input and, hence, the verall delay wuld nt be plynmial. The secnd prblem is that f efficiently enumerating the tuples f FD(R k ) that cannt be jined with any tuple f (FD(R 1),...,FD(R k 1 )). The slutin t the first prblem stems frm the fllwing prperty. Let R be the cnnecting relatin f R k. A tuple 746

9 ˆt (FD(R 1),...,FD(R k 1 )) can be jined with sme tuple t FD(R k ) if and nly if there is a tuple t R, such that (1) t is jin cnsistent with ˆt and (2) t FD t (R k ). Thus, we need t find all the tuples t R that are jin cnsistent with ˆt and enumerate FD t (R k ). This is dne using an iteratr ver TupExtFD (Lines 12 14). We nw address the secnd prblem. There are tw types f tuples f FD(R k ) that cannt be jined with any tuple ˆt (FD(R 1),..., FD(R k 1 )). One type cnsists f the tuples in the relatins FD t (R k ), where t is a tuple f R that is nt jin cnsistent with any tuple ˆt. The secnd type includes all the tuples f FD R (R k ). Fr enumerating tuples f the first type, we mark (in Line 11) the tuples f R that are chsen in Line 1 (thus, we are interested in the unmarked tuples). Fr the secnd type, we use an iteratr ver RelExcFD (Lines 21-23). Nte that RelExcFD is executed nly after TupExtFD is applied t all the tuples f R. If R k is a singletn bicnnected cmpnent {R}, the cmputatin can be dne similarly t the way it was dne in NLOJ, which is simpler and mre efficient. Hwever, I (in Line 7 f Figure 1) still has t be an iteratr f Bi- CmNLOJ rather than f NLOJ, since recursive calls may encunter bicnnected cmpnents that are nt necessarily singletns. Nnetheless, when R is acyclic, the cmputatin becmes identical t that f NLOJ. The cmplexity f BiCmNLOJ is cnsidered in the fllwing therem. We use n i t dente the number f relatins in the bicnnected cmpnent R i and N i t dente the ttal number f tuples in R i. Therem 5.4. Cnsider a cnnected set f relatins R. If R 1,..., R k are the bicnnected cmpnents f R in an SCP rder, then BiCmNLOJ(R 1,..., R k ) enumerates the full disjunctin FD(R). The delay f the enumeratin is k i=1 O( Mi), where Mi = Ni if Ri is a singletn cmpnent and M i = n 2 i N i + Ni 2 + n in i lg N i therwise. Therem 5.4 shws that BiCmNLOJ runs with plynmial delay, thereby implying a ttal time that is linear in the size f the utput. Observe that when k > 1, the delay f BiCmNLOJ is shrter than that f PDelayFD, due t ur strategy f decmpsing the scheme graph int bicnnected cmpnents. In Therem 5.4, we assume that the set R is cnnected. Recall that when R is nt cnnected, the full disjunctin f R is simply an uterunin f the full disjunctins f the cnnected cmpnents f R. We nw return t the prblem f null values in the input relatins. First, we emphasize that all ur algrithms, except fr BiCmNLOJ, wrk crrectly fr relatins that cntain null values. The prblem with BiCmNLOJ is illustrated by the fllwing example. Cnsider tw bicnnected cmpnents R 1 and R 2. Suppse that the intersectin f R 1 and R 2 is the relatin (articulatin pint) R and, mrever, R has an attribute A that appears in n ther relatin f either R 1 r R 2. Als, we assume that R cntains a single tuple t that has a null in A. Let ˆT 1 and ˆT 2 be tuple sets f R 1 and R 2, respectively, such that t is cntained in bth ˆT 1 and ˆT 2. Because ˆT 1 and ˆT 2 are jin cnsistent with the tuple f the shared relatin, they shuld be jined. Hwever, since there is a null in A, the tuples embed S1 ( ˆT 1) and embed S2 ( ˆT 2) (where S 1 and S 2 are the schemes f R 1 and R 2, respectively) are nt jin cnsistent. BiCmNLOJ((R 1,..., R k )) 1: if k = 1 then 2: PDelayFD(R 1) 3: else 4: S := R (R 1 R 2...R k ) sc(r ) 5: let R be the cnnecting relatin f R k 6: Q R := 7: I := new Iteratr(BiCmNLOJ,(R 1,..., R k 1 )) 8: while I.hasNext() d 9: ˆt := I.next() 1: fr all tuples t R, s.t. JCC({t, ˆt}) d 11: mark t 12: I t := new Iteratr (TupExtFD, (R k, R, t, Q R)) 13: while I t.hasnext() d 14: utput embed S({I t.next(), ˆt}) 15: 16: if t R, s.t. JC({t, ˆt}) then utput embed S(ˆt) 17: fr all unmarked tuples t R d 18: I t := new Iteratr(TupExtFD, (R k, R, t, Q R)) 19: while I t.hasnext() d 2: utput (embed S(I t.next())) 21: I R := new Iteratr(RelExcFD,(R k, R, t, Q R)) 22: while I R.hasNext() d 23: utput (embed S(I R.next())) Figure 1: Main algrithm fr cmputing FD(R) There are tw pssible slutins t the prblem f null values in the the input relatins. One slutin is t change the algrithms s that they wrk with tuple sets actual tuples f the full disjunctin are generated just befre printing them. A secnd slutin is t slightly mdify the functins JC() and JCC(), used in lines 1 and 15 f BiCmNLOJ, s that embeddings f tuple sets that have a tuple in cmmn will be cnsidered jin cnsistent even when the cmmn tuple cntains null values. Nte that this requires sme bkkeeping t indicate whether embeddings were generated frm tuple sets that included tuples frm articulatin pints. 6. FULL DISJUNCTIONS AND SQL In this sectin, we discuss in general lines the incrpratin f full disjunctins int query plans fr SQL. By using the algrithms presented in this paper, the full-disjunctin peratr can be incrprated int query plans as a fully pipelined cmpnent. Therefre, in many cases, full disjunctins need nt be stred in temprary relatins. Next, we discuss hw t ptimize queries that invlve full disjunctins as well as additinal peratrs. The query f Example 1.1, fr instance, als invlves a prjectin, a selectin and srting. We start by cnsidering the ORDER BY peratr. Suppse that in a given query, this peratr is applied t nly ne attribute A (e.g., as in the query f Example 1.1). The algrithm PDelayFD can be mdified t enumerate the full disjunctin in a srted rder, withut increasing the delay. This requires starting with the relatins whse schema cntain A. Tuples frm these relatins are then taken rdered accrding t A and extended as being dne in PDelayFD. This can be generalized t enumerating full disjunctins in ranked 747

10 rder accrding t any ranking functin that is mntnically c-determined, as defined in [2]. (Intuitively, a ranking functin is mntnically c-determined if fr each tuple t in the full disjunctin it is pssible t determine the relative ranking f t by lking at a fixed number c f tuples amng the tuples that are jined t generate t.) It fllws that if the ORDER BY peratr is applied t a bunded number f attributes, such that the relatins cntaining these attributes frm a clique in the scheme graph, then the tuples f the full disjunctin can be enumerated with plynmial delay in srted rder. The algrithm Bi- CmNLOJ can als be mdified t enumerate the tuples in srted rder (under the same cnditins described abve). Fr that, all the attributes f the ORDER BY clause shuld be in a single bicnnected cmpnent; hwever, if this is nt the case, then we can cmbine all the bicnnected cmpnents that cntain these attributes and treat them as ne. We leave further investigatin f these t future wrk. As fr the selectin peratr, the situatin is mre cmplicated. It is NP-cmplete t test whether the result f selecting tuples frm a full disjunctin is nt empty, even if the cnditin cnsists f nly tw equalities [12]. Intuitively, if we naively delete tuples that d nt satisfy the cnditin f a selectin prir t cmputing the full disjunctin, we might end up with partial infrmatin that wuld nt be there therwise, i.e., wuld be jined with tuples that d nt satisfy the cnditin f the selectin. This means that, in general, selectins cannt be pushed thrugh a full disjunctin. In the special case that a selectin with a single attribute is applied t a full disjunctin, it is pssible t cmpute the selectin f a full disjunctin, while retaining plynmial delay. Intuitively, this can be achieved by enumerating the result in a special rder. We can push a selectin thugh a full disjunctin by applying a srted enumeratin that terminates after a certain pint. Suppse, fr example, that the selectin cnsists f a cnditin cnd(a) that invlves a single attribute A. We can enumerate the full disjunctin in a srted rder accrding t A, where the rder is defined s that values X satisfying cnd(x) precede all ther values. Enumeratin terminates when the first tuple that des nt satisfy the cnditin is generated. In general, it is nt pssible t cmpute the result f a prjectin 4 f a full disjunctin in plynmial delay, r even plynmial ttal time, unless P NP [12]. Hwever, it is pssible t imprve the naive methd f first cmpletely cmputing a full disjunctin, and then applying a prjectin, by pushing the prjectin thrugh the full disjunctin, similarly t pushing it thrugh a natural jin. In ther wrds, attributes that d nt appear in the query and are nt used fr jining relatins can be prjected ut early. Mrever, if all the prjectin and selectin attributes belng t a single bicnnected cmpnent, it is sufficient t cmpute the full disjunctin f this cmpnent. In general, given a set f prjectin (and selectin) attributes we nly need t cmpute the full disjunctin w.r.t. cmpnents that cntain at least ne relatin that is n a simple path, in the scheme graph, between tw relatins that cntain prjectin r selectin attributes. 7. EXPERIMENTATION 4 We cnsider nly prjectin that discards duplicatins, i.e., when distinct is used. Dealing with prjectin that retains duplicatins is beynd the scpe f this discussin. {B,C,I} {I,J,K} {K,L,M} {A,B,C} {A,D,E} {D,F} {J,K,M} {J,L} (a) G sc(r 2) {E,G} {G,H} {C,D,E} {A,D,E} {I,J,K} {J,K} {E,F,G} {G,H,I} {A,B,C} {J,L} {I,L} (b) G sc(r 3) {B,F} Figure 11: Scheme graphs used in the experiments We implemented ur algrithms fr cmputing full disjunctins in the pen-surce database system PstgreSQL, Versin The experiments were carried ut n a Pentium 4 with a CPU f 1.6GHZ and 512MB f RAM, running the Linux Mandrake mdk perating system. We discuss implementatin details and ur experimentatin. 7.1 Implementatin Details We implemented bth PDelayFD and BiCmNLOJ, and we added basic ptimizatins t this implementatin. Fr example, we strategically discarded sme intermediate results when it was pssible t prve that these results wuld nt yield new tuples. We als tried different strategies fr chsing relatins while extending tuple sets in Extend- TMax. The mst imprtant ptimizatin in the implementatin f BiCmNLOJ is that in the executin f TupleExtFD, the generated tuples are cached n the disk. If TupleExtFD is called mre than nce with the same argument t, it retrieves the answer frm the cache. The presented algrithms are tuple based, hwever, ur implementatins are blck-based versins f these algrithms. Intuitively, this means that whenever pssible, they lp ver blcks f tuples, instead f ver individual tuples. This speeds up f the cmputatin, in several places. Fr example, ExtendTMax ptentially reads the entire cntents f the database. In its blck-based versin, several tuples are extended at nce, thereby reducing the number f reads frm the disk. Our implementatin f PDelayFD is fully blck based. Our implementatin f BiCmNLOJ is nly partially blck based at present, due t the need fr rather cmplex bkkeeping in a fully blck-based versin f this algrithm. In additin t PDelayFD and BiCmNLOJ, we have als implemented (within the PstgreSQL setting) a fully blck-based versin f the algrithm IncrementalFD [2]. This algrithm cmputes full disjunctins in incremental plynmial time and it is the mst efficient algrithm in the literature. 7.2 Experiments and Results We cnducted ur experiments n synthetic, randmly generated databases with predefined schemes. We used data- 748

Homology groups of disks with holes

Homology groups of disks with holes Hmlgy grups f disks with hles THEOREM. Let p 1,, p k } be a sequence f distinct pints in the interir unit disk D n where n 2, and suppse that fr all j the sets E j Int D n are clsed, pairwise disjint subdisks.

More information

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax .7.4: Direct frequency dmain circuit analysis Revisin: August 9, 00 5 E Main Suite D Pullman, WA 9963 (509) 334 6306 ice and Fax Overview n chapter.7., we determined the steadystate respnse f electrical

More information

CHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India

CHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India CHAPTER 3 INEQUALITIES Cpyright -The Institute f Chartered Accuntants f India INEQUALITIES LEARNING OBJECTIVES One f the widely used decisin making prblems, nwadays, is t decide n the ptimal mix f scarce

More information

Turing Machines. Human-aware Robotics. 2017/10/17 & 19 Chapter 3.2 & 3.3 in Sipser Ø Announcement:

Turing Machines. Human-aware Robotics. 2017/10/17 & 19 Chapter 3.2 & 3.3 in Sipser Ø Announcement: Turing Machines Human-aware Rbtics 2017/10/17 & 19 Chapter 3.2 & 3.3 in Sipser Ø Annuncement: q q q q Slides fr this lecture are here: http://www.public.asu.edu/~yzhan442/teaching/cse355/lectures/tm-ii.pdf

More information

Revisiting the Socrates Example

Revisiting the Socrates Example Sectin 1.6 Sectin Summary Valid Arguments Inference Rules fr Prpsitinal Lgic Using Rules f Inference t Build Arguments Rules f Inference fr Quantified Statements Building Arguments fr Quantified Statements

More information

CS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007

CS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007 CS 477/677 Analysis f Algrithms Fall 2007 Dr. Gerge Bebis Curse Prject Due Date: 11/29/2007 Part1: Cmparisn f Srting Algrithms (70% f the prject grade) The bjective f the first part f the assignment is

More information

Chapter 3: Cluster Analysis

Chapter 3: Cluster Analysis Chapter 3: Cluster Analysis } 3.1 Basic Cncepts f Clustering 3.1.1 Cluster Analysis 3.1. Clustering Categries } 3. Partitining Methds 3..1 The principle 3.. K-Means Methd 3..3 K-Medids Methd 3..4 CLARA

More information

Chapter Summary. Mathematical Induction Strong Induction Recursive Definitions Structural Induction Recursive Algorithms

Chapter Summary. Mathematical Induction Strong Induction Recursive Definitions Structural Induction Recursive Algorithms Chapter 5 1 Chapter Summary Mathematical Inductin Strng Inductin Recursive Definitins Structural Inductin Recursive Algrithms Sectin 5.1 3 Sectin Summary Mathematical Inductin Examples f Prf by Mathematical

More information

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) >

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) > Btstrap Methd > # Purpse: understand hw btstrap methd wrks > bs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(bs) > mean(bs) [1] 21.64625 > # estimate f lambda > lambda = 1/mean(bs);

More information

Dead-beat controller design

Dead-beat controller design J. Hetthéssy, A. Barta, R. Bars: Dead beat cntrller design Nvember, 4 Dead-beat cntrller design In sampled data cntrl systems the cntrller is realised by an intelligent device, typically by a PLC (Prgrammable

More information

1996 Engineering Systems Design and Analysis Conference, Montpellier, France, July 1-4, 1996, Vol. 7, pp

1996 Engineering Systems Design and Analysis Conference, Montpellier, France, July 1-4, 1996, Vol. 7, pp THE POWER AND LIMIT OF NEURAL NETWORKS T. Y. Lin Department f Mathematics and Cmputer Science San Jse State University San Jse, Califrnia 959-003 tylin@cs.ssu.edu and Bereley Initiative in Sft Cmputing*

More information

COMP 551 Applied Machine Learning Lecture 9: Support Vector Machines (cont d)

COMP 551 Applied Machine Learning Lecture 9: Support Vector Machines (cont d) COMP 551 Applied Machine Learning Lecture 9: Supprt Vectr Machines (cnt d) Instructr: Herke van Hf (herke.vanhf@mail.mcgill.ca) Slides mstly by: Class web page: www.cs.mcgill.ca/~hvanh2/cmp551 Unless therwise

More information

This section is primarily focused on tools to aid us in finding roots/zeros/ -intercepts of polynomials. Essentially, our focus turns to solving.

This section is primarily focused on tools to aid us in finding roots/zeros/ -intercepts of polynomials. Essentially, our focus turns to solving. Sectin 3.2: Many f yu WILL need t watch the crrespnding vides fr this sectin n MyOpenMath! This sectin is primarily fcused n tls t aid us in finding rts/zers/ -intercepts f plynmials. Essentially, ur fcus

More information

READING STATECHART DIAGRAMS

READING STATECHART DIAGRAMS READING STATECHART DIAGRAMS Figure 4.48 A Statechart diagram with events The diagram in Figure 4.48 shws all states that the bject plane can be in during the curse f its life. Furthermre, it shws the pssible

More information

Computational modeling techniques

Computational modeling techniques Cmputatinal mdeling techniques Lecture 4: Mdel checing fr ODE mdels In Petre Department f IT, Åb Aademi http://www.users.ab.fi/ipetre/cmpmd/ Cntent Stichimetric matrix Calculating the mass cnservatin relatins

More information

Activity Guide Loops and Random Numbers

Activity Guide Loops and Random Numbers Unit 3 Lessn 7 Name(s) Perid Date Activity Guide Lps and Randm Numbers CS Cntent Lps are a relatively straightfrward idea in prgramming - yu want a certain chunk f cde t run repeatedly - but it takes a

More information

Differentiation Applications 1: Related Rates

Differentiation Applications 1: Related Rates Differentiatin Applicatins 1: Related Rates 151 Differentiatin Applicatins 1: Related Rates Mdel 1: Sliding Ladder 10 ladder y 10 ladder 10 ladder A 10 ft ladder is leaning against a wall when the bttm

More information

Broadcast Program Generation for Unordered Queries with Data Replication

Broadcast Program Generation for Unordered Queries with Data Replication Bradcast Prgram Generatin fr Unrdered Queries with Data Replicatin Jiun-Lng Huang and Ming-Syan Chen Department f Electrical Engineering Natinal Taiwan University Taipei, Taiwan, ROC E-mail: jlhuang@arbr.ee.ntu.edu.tw,

More information

Name: Block: Date: Science 10: The Great Geyser Experiment A controlled experiment

Name: Block: Date: Science 10: The Great Geyser Experiment A controlled experiment Science 10: The Great Geyser Experiment A cntrlled experiment Yu will prduce a GEYSER by drpping Ments int a bttle f diet pp Sme questins t think abut are: What are yu ging t test? What are yu ging t measure?

More information

5 th grade Common Core Standards

5 th grade Common Core Standards 5 th grade Cmmn Cre Standards In Grade 5, instructinal time shuld fcus n three critical areas: (1) develping fluency with additin and subtractin f fractins, and develping understanding f the multiplicatin

More information

[COLLEGE ALGEBRA EXAM I REVIEW TOPICS] ( u s e t h i s t o m a k e s u r e y o u a r e r e a d y )

[COLLEGE ALGEBRA EXAM I REVIEW TOPICS] ( u s e t h i s t o m a k e s u r e y o u a r e r e a d y ) (Abut the final) [COLLEGE ALGEBRA EXAM I REVIEW TOPICS] ( u s e t h i s t m a k e s u r e y u a r e r e a d y ) The department writes the final exam s I dn't really knw what's n it and I can't very well

More information

TRAINING GUIDE. Overview of Lucity Spatial

TRAINING GUIDE. Overview of Lucity Spatial TRAINING GUIDE Overview f Lucity Spatial Overview f Lucity Spatial In this sessin, we ll cver the key cmpnents f Lucity Spatial. Table f Cntents Lucity Spatial... 2 Requirements... 2 Supprted Mdules...

More information

COMP 551 Applied Machine Learning Lecture 11: Support Vector Machines

COMP 551 Applied Machine Learning Lecture 11: Support Vector Machines COMP 551 Applied Machine Learning Lecture 11: Supprt Vectr Machines Instructr: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/cmp551 Unless therwise nted, all material psted fr this curse

More information

Document for ENES5 meeting

Document for ENES5 meeting HARMONISATION OF EXPOSURE SCENARIO SHORT TITLES Dcument fr ENES5 meeting Paper jintly prepared by ECHA Cefic DUCC ESCOM ES Shrt Titles Grup 13 Nvember 2013 OBJECTIVES FOR ENES5 The bjective f this dcument

More information

MODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards:

MODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards: MODULE FOUR This mdule addresses functins SC Academic Standards: EA-3.1 Classify a relatinship as being either a functin r nt a functin when given data as a table, set f rdered pairs, r graph. EA-3.2 Use

More information

Five Whys How To Do It Better

Five Whys How To Do It Better Five Whys Definitin. As explained in the previus article, we define rt cause as simply the uncvering f hw the current prblem came int being. Fr a simple causal chain, it is the entire chain. Fr a cmplex

More information

Lead/Lag Compensator Frequency Domain Properties and Design Methods

Lead/Lag Compensator Frequency Domain Properties and Design Methods Lectures 6 and 7 Lead/Lag Cmpensatr Frequency Dmain Prperties and Design Methds Definitin Cnsider the cmpensatr (ie cntrller Fr, it is called a lag cmpensatr s K Fr s, it is called a lead cmpensatr Ntatin

More information

WRITING THE REPORT. Organizing the report. Title Page. Table of Contents

WRITING THE REPORT. Organizing the report. Title Page. Table of Contents WRITING THE REPORT Organizing the reprt Mst reprts shuld be rganized in the fllwing manner. Smetime there is a valid reasn t include extra chapters in within the bdy f the reprt. 1. Title page 2. Executive

More information

SUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical model for microarray data analysis

SUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical model for microarray data analysis SUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical mdel fr micrarray data analysis David Rssell Department f Bistatistics M.D. Andersn Cancer Center, Hustn, TX 77030, USA rsselldavid@gmail.cm

More information

Kinetic Model Completeness

Kinetic Model Completeness 5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins

More information

Dataflow Analysis and Abstract Interpretation

Dataflow Analysis and Abstract Interpretation Dataflw Analysis and Abstract Interpretatin Cmputer Science and Artificial Intelligence Labratry MIT Nvember 9, 2015 Recap Last time we develped frm first principles an algrithm t derive invariants. Key

More information

CONSTRUCTING STATECHART DIAGRAMS

CONSTRUCTING STATECHART DIAGRAMS CONSTRUCTING STATECHART DIAGRAMS The fllwing checklist shws the necessary steps fr cnstructing the statechart diagrams f a class. Subsequently, we will explain the individual steps further. Checklist 4.6

More information

Administrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1

Administrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1 Administrativia Assignment 1 due thursday 9/23/2004 BEFORE midnight Midterm eam 10/07/2003 in class CS 460, Sessins 8-9 1 Last time: search strategies Uninfrmed: Use nly infrmatin available in the prblem

More information

NUROP CONGRESS PAPER CHINESE PINYIN TO CHINESE CHARACTER CONVERSION

NUROP CONGRESS PAPER CHINESE PINYIN TO CHINESE CHARACTER CONVERSION NUROP Chinese Pinyin T Chinese Character Cnversin NUROP CONGRESS PAPER CHINESE PINYIN TO CHINESE CHARACTER CONVERSION CHIA LI SHI 1 AND LUA KIM TENG 2 Schl f Cmputing, Natinal University f Singapre 3 Science

More information

Multiple Source Multiple. using Network Coding

Multiple Source Multiple. using Network Coding Multiple Surce Multiple Destinatin Tplgy Inference using Netwrk Cding Pegah Sattari EECS, UC Irvine Jint wrk with Athina Markpulu, at UCI, Christina Fraguli, at EPFL, Lausanne Outline Netwrk Tmgraphy Gal,

More information

ENSC Discrete Time Systems. Project Outline. Semester

ENSC Discrete Time Systems. Project Outline. Semester ENSC 49 - iscrete Time Systems Prject Outline Semester 006-1. Objectives The gal f the prject is t design a channel fading simulatr. Upn successful cmpletin f the prject, yu will reinfrce yur understanding

More information

Hypothesis Tests for One Population Mean

Hypothesis Tests for One Population Mean Hypthesis Tests fr One Ppulatin Mean Chapter 9 Ala Abdelbaki Objective Objective: T estimate the value f ne ppulatin mean Inferential statistics using statistics in rder t estimate parameters We will be

More information

Pattern Recognition 2014 Support Vector Machines

Pattern Recognition 2014 Support Vector Machines Pattern Recgnitin 2014 Supprt Vectr Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recgnitin 1 / 55 Overview 1 Separable Case 2 Kernel Functins 3 Allwing Errrs (Sft

More information

CHAPTER 2 Algebraic Expressions and Fundamental Operations

CHAPTER 2 Algebraic Expressions and Fundamental Operations CHAPTER Algebraic Expressins and Fundamental Operatins OBJECTIVES: 1. Algebraic Expressins. Terms. Degree. Gruping 5. Additin 6. Subtractin 7. Multiplicatin 8. Divisin Algebraic Expressin An algebraic

More information

Building to Transformations on Coordinate Axis Grade 5: Geometry Graph points on the coordinate plane to solve real-world and mathematical problems.

Building to Transformations on Coordinate Axis Grade 5: Geometry Graph points on the coordinate plane to solve real-world and mathematical problems. Building t Transfrmatins n Crdinate Axis Grade 5: Gemetry Graph pints n the crdinate plane t slve real-wrld and mathematical prblems. 5.G.1. Use a pair f perpendicular number lines, called axes, t define

More information

Math Foundations 20 Work Plan

Math Foundations 20 Work Plan Math Fundatins 20 Wrk Plan Units / Tpics 20.8 Demnstrate understanding f systems f linear inequalities in tw variables. Time Frame December 1-3 weeks 6-10 Majr Learning Indicatrs Identify situatins relevant

More information

Department of Electrical Engineering, University of Waterloo. Introduction

Department of Electrical Engineering, University of Waterloo. Introduction Sectin 4: Sequential Circuits Majr Tpics Types f sequential circuits Flip-flps Analysis f clcked sequential circuits Mre and Mealy machines Design f clcked sequential circuits State transitin design methd

More information

Physics 2010 Motion with Constant Acceleration Experiment 1

Physics 2010 Motion with Constant Acceleration Experiment 1 . Physics 00 Mtin with Cnstant Acceleratin Experiment In this lab, we will study the mtin f a glider as it accelerates dwnhill n a tilted air track. The glider is supprted ver the air track by a cushin

More information

Finite Automata. Human-aware Robo.cs. 2017/08/22 Chapter 1.1 in Sipser

Finite Automata. Human-aware Robo.cs. 2017/08/22 Chapter 1.1 in Sipser Finite Autmata 2017/08/22 Chapter 1.1 in Sipser 1 Last time Thery f cmputatin Autmata Thery Cmputability Thery Cmplexity Thery Finite autmata Pushdwn autmata Turing machines 2 Outline fr tday Finite autmata

More information

NOTE ON APPELL POLYNOMIALS

NOTE ON APPELL POLYNOMIALS NOTE ON APPELL POLYNOMIALS I. M. SHEFFER An interesting characterizatin f Appell plynmials by means f a Stieltjes integral has recently been given by Thrne. 1 We prpse t give a secnd such representatin,

More information

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS 1 Influential bservatins are bservatins whse presence in the data can have a distrting effect n the parameter estimates and pssibly the entire analysis,

More information

MODULE 1. e x + c. [You can t separate a demominator, but you can divide a single denominator into each numerator term] a + b a(a + b)+1 = a + b

MODULE 1. e x + c. [You can t separate a demominator, but you can divide a single denominator into each numerator term] a + b a(a + b)+1 = a + b . REVIEW OF SOME BASIC ALGEBRA MODULE () Slving Equatins Yu shuld be able t slve fr x: a + b = c a d + e x + c and get x = e(ba +) b(c a) d(ba +) c Cmmn mistakes and strategies:. a b + c a b + a c, but

More information

A New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation

A New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation III-l III. A New Evaluatin Measure J. Jiner and L. Werner Abstract The prblems f evaluatin and the needed criteria f evaluatin measures in the SMART system f infrmatin retrieval are reviewed and discussed.

More information

You need to be able to define the following terms and answer basic questions about them:

You need to be able to define the following terms and answer basic questions about them: CS440/ECE448 Sectin Q Fall 2017 Midterm Review Yu need t be able t define the fllwing terms and answer basic questins abut them: Intr t AI, agents and envirnments Pssible definitins f AI, prs and cns f

More information

1 The limitations of Hartree Fock approximation

1 The limitations of Hartree Fock approximation Chapter: Pst-Hartree Fck Methds - I The limitatins f Hartree Fck apprximatin The n electrn single determinant Hartree Fck wave functin is the variatinal best amng all pssible n electrn single determinants

More information

Subject description processes

Subject description processes Subject representatin 6.1.2. Subject descriptin prcesses Overview Fur majr prcesses r areas f practice fr representing subjects are classificatin, subject catalging, indexing, and abstracting. The prcesses

More information

Preparation work for A2 Mathematics [2017]

Preparation work for A2 Mathematics [2017] Preparatin wrk fr A2 Mathematics [2017] The wrk studied in Y12 after the return frm study leave is frm the Cre 3 mdule f the A2 Mathematics curse. This wrk will nly be reviewed during Year 13, it will

More information

Technical Bulletin. Generation Interconnection Procedures. Revisions to Cluster 4, Phase 1 Study Methodology

Technical Bulletin. Generation Interconnection Procedures. Revisions to Cluster 4, Phase 1 Study Methodology Technical Bulletin Generatin Intercnnectin Prcedures Revisins t Cluster 4, Phase 1 Study Methdlgy Release Date: Octber 20, 2011 (Finalizatin f the Draft Technical Bulletin released n September 19, 2011)

More information

MODULAR DECOMPOSITION OF THE NOR-TSUM MULTIPLE-VALUED PLA

MODULAR DECOMPOSITION OF THE NOR-TSUM MULTIPLE-VALUED PLA MODUAR DECOMPOSITION OF THE NOR-TSUM MUTIPE-AUED PA T. KAGANOA, N. IPNITSKAYA, G. HOOWINSKI k Belarusian State University f Infrmatics and Radielectrnics, abratry f Image Prcessing and Pattern Recgnitin.

More information

NUMBERS, MATHEMATICS AND EQUATIONS

NUMBERS, MATHEMATICS AND EQUATIONS AUSTRALIAN CURRICULUM PHYSICS GETTING STARTED WITH PHYSICS NUMBERS, MATHEMATICS AND EQUATIONS An integral part t the understanding f ur physical wrld is the use f mathematical mdels which can be used t

More information

SPH3U1 Lesson 06 Kinematics

SPH3U1 Lesson 06 Kinematics PROJECTILE MOTION LEARNING GOALS Students will: Describe the mtin f an bject thrwn at arbitrary angles thrugh the air. Describe the hrizntal and vertical mtins f a prjectile. Slve prjectile mtin prblems.

More information

Cambridge Assessment International Education Cambridge Ordinary Level. Published

Cambridge Assessment International Education Cambridge Ordinary Level. Published Cambridge Assessment Internatinal Educatin Cambridge Ordinary Level ADDITIONAL MATHEMATICS 4037/1 Paper 1 Octber/Nvember 017 MARK SCHEME Maximum Mark: 80 Published This mark scheme is published as an aid

More information

ENG2410 Digital Design Sequential Circuits: Part A

ENG2410 Digital Design Sequential Circuits: Part A ENG2410 Digital Design Sequential Circuits: Part A Fall 2017 S. Areibi Schl f Engineering University f Guelph Week #6 Tpics Sequential Circuit Definitins Latches Flip-Flps Delays in Sequential Circuits

More information

Comprehensive Exam Guidelines Department of Chemical and Biomolecular Engineering, Ohio University

Comprehensive Exam Guidelines Department of Chemical and Biomolecular Engineering, Ohio University Cmprehensive Exam Guidelines Department f Chemical and Bimlecular Engineering, Ohi University Purpse In the Cmprehensive Exam, the student prepares an ral and a written research prpsal. The Cmprehensive

More information

Public Key Cryptography. Tim van der Horst & Kent Seamons

Public Key Cryptography. Tim van der Horst & Kent Seamons Public Key Cryptgraphy Tim van der Hrst & Kent Seamns Last Updated: Oct 5, 2017 Asymmetric Encryptin Why Public Key Crypt is Cl Has a linear slutin t the key distributin prblem Symmetric crypt has an expnential

More information

Application Of Mealy Machine And Recurrence Relations In Cryptography

Application Of Mealy Machine And Recurrence Relations In Cryptography Applicatin Of Mealy Machine And Recurrence Relatins In Cryptgraphy P. A. Jytirmie 1, A. Chandra Sekhar 2, S. Uma Devi 3 1 Department f Engineering Mathematics, Andhra University, Visakhapatnam, IDIA 2

More information

A Few Basic Facts About Isothermal Mass Transfer in a Binary Mixture

A Few Basic Facts About Isothermal Mass Transfer in a Binary Mixture Few asic Facts but Isthermal Mass Transfer in a inary Miture David Keffer Department f Chemical Engineering University f Tennessee first begun: pril 22, 2004 last updated: January 13, 2006 dkeffer@utk.edu

More information

MODULE ONE. This module addresses the foundational concepts and skills that support all of the Elementary Algebra academic standards.

MODULE ONE. This module addresses the foundational concepts and skills that support all of the Elementary Algebra academic standards. Mdule Fundatinal Tpics MODULE ONE This mdule addresses the fundatinal cncepts and skills that supprt all f the Elementary Algebra academic standards. SC Academic Elementary Algebra Indicatrs included in

More information

Thermodynamics Partial Outline of Topics

Thermodynamics Partial Outline of Topics Thermdynamics Partial Outline f Tpics I. The secnd law f thermdynamics addresses the issue f spntaneity and invlves a functin called entrpy (S): If a prcess is spntaneus, then Suniverse > 0 (2 nd Law!)

More information

A New Approach to Increase Parallelism for Dependent Loops

A New Approach to Increase Parallelism for Dependent Loops A New Apprach t Increase Parallelism fr Dependent Lps Yeng-Sheng Chen, Department f Electrical Engineering, Natinal Taiwan University, Taipei 06, Taiwan, Tsang-Ming Jiang, Arctic Regin Supercmputing Center,

More information

, which yields. where z1. and z2

, which yields. where z1. and z2 The Gaussian r Nrmal PDF, Page 1 The Gaussian r Nrmal Prbability Density Functin Authr: Jhn M Cimbala, Penn State University Latest revisin: 11 September 13 The Gaussian r Nrmal Prbability Density Functin

More information

Enhancing Performance of MLP/RBF Neural Classifiers via an Multivariate Data Distribution Scheme

Enhancing Performance of MLP/RBF Neural Classifiers via an Multivariate Data Distribution Scheme Enhancing Perfrmance f / Neural Classifiers via an Multivariate Data Distributin Scheme Halis Altun, Gökhan Gelen Nigde University, Electrical and Electrnics Engineering Department Nigde, Turkey haltun@nigde.edu.tr

More information

T Algorithmic methods for data mining. Slide set 6: dimensionality reduction

T Algorithmic methods for data mining. Slide set 6: dimensionality reduction T-61.5060 Algrithmic methds fr data mining Slide set 6: dimensinality reductin reading assignment LRU bk: 11.1 11.3 PCA tutrial in mycurses (ptinal) ptinal: An Elementary Prf f a Therem f Jhnsn and Lindenstrauss,

More information

Sequential Allocation with Minimal Switching

Sequential Allocation with Minimal Switching In Cmputing Science and Statistics 28 (1996), pp. 567 572 Sequential Allcatin with Minimal Switching Quentin F. Stut 1 Janis Hardwick 1 EECS Dept., University f Michigan Statistics Dept., Purdue University

More information

THE LIFE OF AN OBJECT IT SYSTEMS

THE LIFE OF AN OBJECT IT SYSTEMS THE LIFE OF AN OBJECT IT SYSTEMS Persns, bjects, r cncepts frm the real wrld, which we mdel as bjects in the IT system, have "lives". Actually, they have tw lives; the riginal in the real wrld has a life,

More information

A proposition is a statement that can be either true (T) or false (F), (but not both).

A proposition is a statement that can be either true (T) or false (F), (but not both). 400 lecture nte #1 [Ch 2, 3] Lgic and Prfs 1.1 Prpsitins (Prpsitinal Lgic) A prpsitin is a statement that can be either true (T) r false (F), (but nt bth). "The earth is flat." -- F "March has 31 days."

More information

On Topological Structures and. Fuzzy Sets

On Topological Structures and. Fuzzy Sets L - ZHR UNIVERSIT - GZ DENSHIP OF GRDUTE STUDIES & SCIENTIFIC RESERCH On Tplgical Structures and Fuzzy Sets y Nashaat hmed Saleem Raab Supervised by Dr. Mhammed Jamal Iqelan Thesis Submitted in Partial

More information

Pipetting 101 Developed by BSU CityLab

Pipetting 101 Developed by BSU CityLab Discver the Micrbes Within: The Wlbachia Prject Pipetting 101 Develped by BSU CityLab Clr Cmparisns Pipetting Exercise #1 STUDENT OBJECTIVES Students will be able t: Chse the crrect size micrpipette fr

More information

CHM112 Lab Graphing with Excel Grading Rubric

CHM112 Lab Graphing with Excel Grading Rubric Name CHM112 Lab Graphing with Excel Grading Rubric Criteria Pints pssible Pints earned Graphs crrectly pltted and adhere t all guidelines (including descriptive title, prperly frmatted axes, trendline

More information

Experiment #3. Graphing with Excel

Experiment #3. Graphing with Excel Experiment #3. Graphing with Excel Study the "Graphing with Excel" instructins that have been prvided. Additinal help with learning t use Excel can be fund n several web sites, including http://www.ncsu.edu/labwrite/res/gt/gt-

More information

The blessing of dimensionality for kernel methods

The blessing of dimensionality for kernel methods fr kernel methds Building classifiers in high dimensinal space Pierre Dupnt Pierre.Dupnt@ucluvain.be Classifiers define decisin surfaces in sme feature space where the data is either initially represented

More information

Determining the Accuracy of Modal Parameter Estimation Methods

Determining the Accuracy of Modal Parameter Estimation Methods Determining the Accuracy f Mdal Parameter Estimatin Methds by Michael Lee Ph.D., P.E. & Mar Richardsn Ph.D. Structural Measurement Systems Milpitas, CA Abstract The mst cmmn type f mdal testing system

More information

Math Foundations 10 Work Plan

Math Foundations 10 Work Plan Math Fundatins 10 Wrk Plan Units / Tpics 10.1 Demnstrate understanding f factrs f whle numbers by: Prime factrs Greatest Cmmn Factrs (GCF) Least Cmmn Multiple (LCM) Principal square rt Cube rt Time Frame

More information

Floating Point Method for Solving Transportation. Problems with Additional Constraints

Floating Point Method for Solving Transportation. Problems with Additional Constraints Internatinal Mathematical Frum, Vl. 6, 20, n. 40, 983-992 Flating Pint Methd fr Slving Transprtatin Prblems with Additinal Cnstraints P. Pandian and D. Anuradha Department f Mathematics, Schl f Advanced

More information

B. Definition of an exponential

B. Definition of an exponential Expnents and Lgarithms Chapter IV - Expnents and Lgarithms A. Intrductin Starting with additin and defining the ntatins fr subtractin, multiplicatin and divisin, we discvered negative numbers and fractins.

More information

Sections 15.1 to 15.12, 16.1 and 16.2 of the textbook (Robbins-Miller) cover the materials required for this topic.

Sections 15.1 to 15.12, 16.1 and 16.2 of the textbook (Robbins-Miller) cover the materials required for this topic. Tpic : AC Fundamentals, Sinusidal Wavefrm, and Phasrs Sectins 5. t 5., 6. and 6. f the textbk (Rbbins-Miller) cver the materials required fr this tpic.. Wavefrms in electrical systems are current r vltage

More information

Tree Structured Classifier

Tree Structured Classifier Tree Structured Classifier Reference: Classificatin and Regressin Trees by L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stne, Chapman & Hall, 98. A Medical Eample (CART): Predict high risk patients

More information

Module 4: General Formulation of Electric Circuit Theory

Module 4: General Formulation of Electric Circuit Theory Mdule 4: General Frmulatin f Electric Circuit Thery 4. General Frmulatin f Electric Circuit Thery All electrmagnetic phenmena are described at a fundamental level by Maxwell's equatins and the assciated

More information

Chem 163 Section: Team Number: ALE 24. Voltaic Cells and Standard Cell Potentials. (Reference: 21.2 and 21.3 Silberberg 5 th edition)

Chem 163 Section: Team Number: ALE 24. Voltaic Cells and Standard Cell Potentials. (Reference: 21.2 and 21.3 Silberberg 5 th edition) Name Chem 163 Sectin: Team Number: ALE 24. Vltaic Cells and Standard Cell Ptentials (Reference: 21.2 and 21.3 Silberberg 5 th editin) What des a vltmeter reading tell us? The Mdel: Standard Reductin and

More information

We can see from the graph above that the intersection is, i.e., [ ).

We can see from the graph above that the intersection is, i.e., [ ). MTH 111 Cllege Algebra Lecture Ntes July 2, 2014 Functin Arithmetic: With nt t much difficulty, we ntice that inputs f functins are numbers, and utputs f functins are numbers. S whatever we can d with

More information

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs Admissibility Cnditins and Asympttic Behavir f Strngly Regular Graphs VASCO MOÇO MANO Department f Mathematics University f Prt Oprt PORTUGAL vascmcman@gmailcm LUÍS ANTÓNIO DE ALMEIDA VIEIRA Department

More information

The standards are taught in the following sequence.

The standards are taught in the following sequence. B L U E V A L L E Y D I S T R I C T C U R R I C U L U M MATHEMATICS Third Grade In grade 3, instructinal time shuld fcus n fur critical areas: (1) develping understanding f multiplicatin and divisin and

More information

Tutorial 3: Building a spectral library in Skyline

Tutorial 3: Building a spectral library in Skyline SRM Curse 2013 Tutrial 3 Spectral Library Tutrial 3: Building a spectral library in Skyline Spectral libraries fr SRM methd design and fr data analysis can be either directly added t a Skyline dcument

More information

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression 3.3.4 Prstate Cancer Data Example (Cntinued) 3.4 Shrinkage Methds 61 Table 3.3 shws the cefficients frm a number f different selectin and shrinkage methds. They are best-subset selectin using an all-subsets

More information

Lyapunov Stability Stability of Equilibrium Points

Lyapunov Stability Stability of Equilibrium Points Lyapunv Stability Stability f Equilibrium Pints 1. Stability f Equilibrium Pints - Definitins In this sectin we cnsider n-th rder nnlinear time varying cntinuus time (C) systems f the frm x = f ( t, x),

More information

A Correlation of. to the. South Carolina Academic Standards for Mathematics Precalculus

A Correlation of. to the. South Carolina Academic Standards for Mathematics Precalculus A Crrelatin f Suth Carlina Academic Standards fr Mathematics Precalculus INTRODUCTION This dcument demnstrates hw Precalculus (Blitzer), 4 th Editin 010, meets the indicatrs f the. Crrelatin page references

More information

Surface and Contact Stress

Surface and Contact Stress Surface and Cntact Stress The cncept f the frce is fundamental t mechanics and many imprtant prblems can be cast in terms f frces nly, fr example the prblems cnsidered in Chapter. Hwever, mre sphisticated

More information

COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification

COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification COMP 551 Applied Machine Learning Lecture 5: Generative mdels fr linear classificatin Instructr: Herke van Hf (herke.vanhf@mail.mcgill.ca) Slides mstly by: Jelle Pineau Class web page: www.cs.mcgill.ca/~hvanh2/cmp551

More information

2.161 Signal Processing: Continuous and Discrete Fall 2008

2.161 Signal Processing: Continuous and Discrete Fall 2008 MIT OpenCurseWare http://cw.mit.edu 2.161 Signal Prcessing: Cntinuus and Discrete Fall 2008 Fr infrmatin abut citing these materials r ur Terms f Use, visit: http://cw.mit.edu/terms. Massachusetts Institute

More information

Lecture 17: Free Energy of Multi-phase Solutions at Equilibrium

Lecture 17: Free Energy of Multi-phase Solutions at Equilibrium Lecture 17: 11.07.05 Free Energy f Multi-phase Slutins at Equilibrium Tday: LAST TIME...2 FREE ENERGY DIAGRAMS OF MULTI-PHASE SOLUTIONS 1...3 The cmmn tangent cnstructin and the lever rule...3 Practical

More information

Engineering Decision Methods

Engineering Decision Methods GSOE9210 vicj@cse.unsw.edu.au www.cse.unsw.edu.au/~gs9210 Maximin and minimax regret 1 2 Indifference; equal preference 3 Graphing decisin prblems 4 Dminance The Maximin principle Maximin and minimax Regret

More information

DEFENSE OCCUPATIONAL AND ENVIRONMENTAL HEALTH READINESS SYSTEM (DOEHRS) ENVIRONMENTAL HEALTH SAMPLING ELECTRONIC DATA DELIVERABLE (EDD) GUIDE

DEFENSE OCCUPATIONAL AND ENVIRONMENTAL HEALTH READINESS SYSTEM (DOEHRS) ENVIRONMENTAL HEALTH SAMPLING ELECTRONIC DATA DELIVERABLE (EDD) GUIDE DEFENSE OCCUPATIOL AND ENVIRONMENTAL HEALTH READINESS SYSTEM (DOEHRS) ENVIRONMENTAL HEALTH SAMPLING ELECTRONIC DATA DELIVERABLE (EDD) GUIDE 20 JUNE 2017 V1.0 i TABLE OF CONTENTS 1 INTRODUCTION... 1 2 CONCEPT

More information

MAKING DOUGHNUTS OF COHEN REALS

MAKING DOUGHNUTS OF COHEN REALS MAKING DUGHNUTS F CHEN REALS Lrenz Halbeisen Department f Pure Mathematics Queen s University Belfast Belfast BT7 1NN, Nrthern Ireland Email: halbeis@qub.ac.uk Abstract Fr a b ω with b \ a infinite, the

More information

Trigonometric Ratios Unit 5 Tentative TEST date

Trigonometric Ratios Unit 5 Tentative TEST date 1 U n i t 5 11U Date: Name: Trignmetric Ratis Unit 5 Tentative TEST date Big idea/learning Gals In this unit yu will extend yur knwledge f SOH CAH TOA t wrk with btuse and reflex angles. This extensin

More information