arxiv: v1 [cs.db] 25 Jun 2013


Communication Steps for Parallel Query Processing

Paul Beame, Paraschos Koutris and Dan Suciu
University of Washington

June 26, 2013

Abstract

We consider the problem of computing a relational query $q$ on a large input database of size $n$, using a large number $p$ of servers. The computation is performed in rounds, and each server can receive only $O(n/p^{1-\varepsilon})$ bits of data, where $\varepsilon \in [0,1]$ is a parameter that controls replication. We examine how many global communication steps are needed to compute $q$. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires $\varepsilon \ge 1 - 1/\tau$, where $\tau$ is the fractional vertex cover number of the hypergraph of $q$. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent $\varepsilon$. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in $O(1)$ rounds of communication.

1 Introduction

Most of the time spent in big data analysis today is allocated to data processing tasks, such as identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results [5, 8]. These tasks form the main bottleneck in big data analysis, and a major challenge for the database community is improving the performance and usability of data processing tools. The motivation for this paper comes from the need to understand the complexity of query processing in big data management. Query processing is typically performed on a shared-nothing parallel architecture.
In this setting, the data is stored on a large number of independent servers interconnected by a fast network. The servers perform local computations, then exchange data in global data shuffling steps. This model of computation has been popularized by MapReduce [7] and Hadoop [15], and can be found in most big data processing systems, like PigLatin [21], Hive [23], Dremel [19]. Unlike traditional query processing, the complexity is no longer dominated by the number of disk accesses. Typically, a query is evaluated by a sufficiently large number of servers such that the entire data can be kept in the main memory of these servers. The new complexity bottleneck is the communication. Typical network speeds in large clusters are 1Gb/s, which is significantly lower

than main memory access. In addition, any data reshuffling requires a global synchronization of all servers, which also comes at significant cost; for example, everyone needs to wait for the slowest server, and, worse, in the case of a straggler, or a local node failure, everyone must wait for the full recovery. Thus, the dominating complexity parameters in big data query processing are the number of communication steps, and the amount of data being exchanged.

MapReduce-related models. Several computation models have been proposed in order to understand the power of MapReduce and related massively parallel programming methods [9, 16, 17, 1]. These all identify the number of communication steps/rounds as a main complexity parameter, but differ in their treatment of the communication. The first of these models was the MUD (Massive, Unordered, Distributed) model of Feldman et al. [9]. It takes as input a sequence of elements and applies a binary merge operation repeatedly, until obtaining a final result, similarly to a User Defined Aggregate in database systems. The paper compares MUD with streaming algorithms: a streaming algorithm can trivially simulate MUD, and the converse is also possible if the merge operators are computationally powerful (beyond PTIME). Karloff et al. [16] define MRC, a class of multi-round algorithms based on using the MapReduce primitive as the sole building block, and fixing specific parameters for balanced processing. The number of processors $p$ is $\Theta(N^{1-\epsilon})$, and each can exchange MapReduce outputs expressible in $\Theta(N^{1-\epsilon})$ bits per step, resulting in $\Theta(N^{2-2\epsilon})$ total storage among the processors on a problem of size $N$. Their focus was algorithmic, showing simulations of other parallel models by MRC, as well as the power of two round algorithms for specific problems. Lower bounds for the single round MapReduce model are first discussed by Afrati et al. [1], who derive an interesting tradeoff between reducer size and replication rate. This is nicely illustrated by Ullman's drug interaction example [25].
There are $n$ (= 6,500) drugs, each consisting of about 1MB of data about patients who took that drug, and one has to find all drug interactions, by applying a user defined function (UDF) to all pairs of drugs. To see the tradeoffs, it helps to simplify the example, by assuming we are given two sets, each of size $n$, and we have to apply a UDF to every pair of items, one from each set, in effect computing their cartesian product. There are two extreme ways to solve this. One can use $n^2$ reducers, one for each pair of items; while each reducer has size 2, this approach is impractical because the entire data is replicated $n$ times. At the other extreme one can use a single reducer that handles the entire data; the replication rate is 1, but the size of the reducer is $2n$, which is also impractical. As a tradeoff, partition each set into $g$ groups of size $n/g$, and use one reducer for each of the $g^2$ pairs of groups: the size of a reducer is $2n/g$, while the replication rate is $g$. Thus, there is a tradeoff between the replication rate and the reducer size, which was also shown to hold for several other classes of problems [1].

Towards lower bound models. There are two significant limitations of this prior work: (1) As powerful and as convenient as the MapReduce framework is, the operations it provides may not be able to take full advantage of the resource constraints of modern systems. The lower bounds say nothing about alternative ways of structuring the computation that send and receive the same amount of data per step. (2) Even within the MapReduce framework, the only lower bounds apply to a single communication round, and say nothing about the limitations of multi-round MapReduce

algorithms. While it is convenient that MapReduce hides the number of servers from the programmer, when considering the most efficient way to use resources to solve problems it is natural to expose information about those resources to the programmer. In this paper, we take the view that the number of servers $p$ should be an explicit parameter of the model, which allows us to focus on the tradeoff between the amount of communication and the number of rounds. For example, going back to our cartesian product problem, if the number of servers $p$ is known, there is one optimal way to solve the problem: partition each of the two sets into $g = \sqrt{p}$ groups, and let each server handle one pair of groups. A model with $p$ as explicit parameter was proposed by Koutris and Suciu [17], who showed both lower and upper bounds for one round of communication. In this model only tuples are sent and they must be routed independently of each other. For example, [17] proves that multi-joins on the same attribute can be computed in one round, while multi-joins on different attributes, like $R(x), S(x, y), T(y)$, require strictly more than one round. The study was mostly focused on understanding data skew, the model was limited, and the results do not apply to more than one round. In this paper we develop more general models, establish lower bounds that hold even in the absence of skew, and use a bit model, rather than a tuple model, to represent data.

Our lower bound models and results. We define the Massively Parallel Communication (MPC) model, to analyze the tradeoff between the number of rounds and the amount of communication required in a massively parallel computing environment. We include the number of servers $p$ as a parameter, and allow each server to be infinitely powerful, subject only to the data to which it has access. The model requires that each server receives only $O(N/p^{1-\varepsilon})$ bits of data at any step, where $N$ is the problem size, and $\varepsilon \in [0, 1]$ is a parameter of the model. This implies that the replication factor is $O(p^{\varepsilon})$ per round.
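The grouping scheme for the cartesian product problem can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `assign_pairs`, `max_load` and `replication_rate` are hypothetical, and the code assumes that $p$ is a perfect square and that $\sqrt{p}$ divides $n$, neither of which is required by the text.

```python
import math

def assign_pairs(n, p):
    """Arrange p servers as a g x g grid (g = sqrt(p)) and give server (i, j)
    group i of the first set and group j of the second set.
    Assumption of this sketch: p is a perfect square and g divides n."""
    g = math.isqrt(p)
    if g * g != p or n % g != 0:
        raise ValueError("sketch assumes p is a perfect square and sqrt(p) divides n")
    size = n // g
    groups = [list(range(i * size, (i + 1) * size)) for i in range(g)]
    return {(i, j): (groups[i], groups[j]) for i in range(g) for j in range(g)}

def max_load(servers):
    """Largest number of items held by a single server (2n/g in this scheme)."""
    return max(len(a) + len(b) for a, b in servers.values())

def replication_rate(servers, n):
    """Total items received by all servers divided by the input size 2n."""
    return sum(len(a) + len(b) for a, b in servers.values()) / (2 * n)

servers = assign_pairs(n=12, p=9)
print(max_load(servers), replication_rate(servers, 12))
```

With these definitions the per-server load is $2n/g$ and the replication rate is $g = \sqrt{p}$, which corresponds to $\varepsilon = 1/2$ in the MPC model.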
A particularly natural case is $\varepsilon = 0$, which corresponds to a replication factor of $O(1)$, or $O(N/p)$ bits per server; $\varepsilon = 1$ is degenerate, since it allows the entire data to be sent to every server.

We establish both lower and upper bounds for computing a full conjunctive query $q$, in two settings. First, we restrict the computation to a single communication round and examine the minimum parameter $\varepsilon$ for which it is possible to compute $q$ with $O(N/p^{1-\varepsilon})$ bits per processor; we call this the space exponent. We show that the space exponent for connected queries is always at least $1 - 1/\tau(q)$, where $\tau(q)$ is the fractional (vertex) covering number of the hypergraph associated with $q$ [6], which is the optimal value of the vertex cover linear program (LP) for that hypergraph. This lower bound applies to the strongest possible model in which servers can encode any information in their messages, and have access to a common source of randomness. This is stronger than the lower bounds in [1, 17], which assume that the units being exchanged are tuples. Our one round lower bound holds even in the special case of matching databases, when all attributes are from the same domain $[n]$ and all input relations are (hypergraph) matchings; in other words, every relation has exactly $n$ tuples, and every attribute contains every value $1, 2, \ldots, n$ exactly once. Thus, the lower bound holds even in a case in which there is no data skew. We describe a simple tuple-independent algorithm that is easily implementable in the MapReduce framework, which, in the special case of matching databases, matches our lower bound for any conjunctive

query. The algorithm uses the optimal solution for the fractional vertex cover to find an optimal split of the input data to the servers. For example, the linear query $L_2 = S_1(x, y), S_2(y, z)$ has an optimal vertex cover $(0, 1, 0)$ (for the variables $x, y, z$), hence its space exponent is $\varepsilon = 0$, whereas the cycle query $C_3 = S_1(x, y), S_2(y, z), S_3(z, x)$ has optimal vertex cover $(1/2, 1/2, 1/2)$ and space exponent $\varepsilon = 1/3$. We note that recent work [13, 4, 20] gives upper bounds on the query size in terms of a fractional edge cover, while our results are in terms of the vertex cover. Thus, our first result is:

Theorem 1.1. For every connected conjunctive query $q$, any $p$-processor randomized MPC algorithm computing $q$ in one round requires space exponent $\varepsilon \ge 1 - 1/\tau(q)$. This lower bound holds even over matching databases, for which it is optimal.

Second, we establish lower bounds for multiple communication steps, for a restricted version of the MPC model, called the tuple-based MPC model. The messages sent in the first round are still unrestricted, but in subsequent rounds the servers can send only tuples, either base tuples in the input tables, or join tuples corresponding to a subquery; moreover, the destinations of each tuple may depend only on the tuple content, the message received in the first round, the server, and the round. We note that any multi-step MapReduce program is tuple-based, because in any map function the key of the intermediate value depends only on the input tuple to the map function. Here, we prove that the number of rounds required is, essentially, given by the depth of a query plan for the query, where each operator is a subquery that can be computed in one round for the given $\varepsilon$. For example, to compute a length $k$ chain query $L_k$, if $\varepsilon = 0$, the optimal computation is a bushy join tree, where each operator is $L_2$ (a two-way join) and the optimal number of rounds is $\log_2 k$. If $\varepsilon = 1/2$, then we can use $L_4$ as operator (a four-way join), and the optimal number of rounds is $\log_4 k$.
More generally, we can show nearly matching upper and lower bounds based on graph-theoretic properties of the query such as the following:

Theorem 1.2. For space exponent $\varepsilon$, the number of rounds required for any tuple-based MPC algorithm to compute any tree-like conjunctive query $q$ is at least $\lceil \log_{k_\varepsilon}(\mathrm{diam}(q)) \rceil$, where $k_\varepsilon = 2\lfloor 1/(1-\varepsilon) \rfloor$ and $\mathrm{diam}(q)$ is the diameter of $q$. Moreover, for any connected conjunctive query $q$, this lower bound is nearly matched (up to a difference of essentially one round) by a tuple-based MPC algorithm with space exponent $\varepsilon$.

We further show that our results for conjunctive path queries imply that any tuple-based MPC algorithm with space exponent $\varepsilon < 1$ requires $\Omega(\log p)$ rounds to compute the transitive closure or connected components of sparse undirected graphs. This is an interesting contrast to the results of [16], which show that connected components (and indeed minimum spanning trees) of undirected graphs can be computed in only two rounds of MapReduce provided that the input graph is sufficiently dense. These are the first lower bounds that apply to multiple rounds of MapReduce.

Both lower bounds in Theorem 1.1 and Theorem 1.2 are stated in a strong form: we show that any algorithm on the MPC model retrieves only a $1/p^{\Omega(1)}$ fraction of the answers to the query in expectation, when the inputs are drawn uniformly at random (the exponent depends on the query and on $\varepsilon$); Yao's Lemma [26] immediately implies a lower bound for any randomized algorithm over worst-case inputs. Notice that the fraction of answers gets worse as the number of servers $p$ increases.
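The round bound of Theorem 1.2 is easy to evaluate for concrete parameters. The following Python sketch is a hedged illustration, not part of the paper: the function names are hypothetical, the space exponent is assumed to be rational so that exact arithmetic can be used, and the bound is read as the smallest integer $r$ with $k_\varepsilon^r \ge \mathrm{diam}(q)$.

```python
from fractions import Fraction
from math import floor

def k_eps(eps):
    """k_eps = 2 * floor(1 / (1 - eps)) for a rational space exponent 0 <= eps < 1."""
    eps = Fraction(eps)
    if not 0 <= eps < 1:
        raise ValueError("space exponent must satisfy 0 <= eps < 1")
    return 2 * floor(1 / (1 - eps))

def round_lower_bound(diameter, eps):
    """Smallest integer r with k_eps ** r >= diameter, i.e. ceil(log_{k_eps}(diameter)).
    Computed with integers only, to avoid floating-point logarithms."""
    base = k_eps(eps)
    rounds, reach = 0, 1
    while reach < diameter:
        reach *= base
        rounds += 1
    return rounds

# The chain query L_k has diameter k; compare eps = 0 and eps = 1/2 for k = 16.
print(round_lower_bound(16, 0), round_lower_bound(16, Fraction(1, 2)))
```

For the chain query $L_{16}$ this reproduces the $\log_2 k$ and $\log_4 k$ behaviour described for $\varepsilon = 0$ and $\varepsilon = 1/2$.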

In other words, the more parallelism we want, the worse an algorithm performs, if the number of communication rounds is bounded.

Related work in communication complexity. The results we show belong to the study of communication complexity, for which there is a very large body of existing research [18]. Communication complexity considers the number of bits that need to be communicated between cooperating agents in order to solve computational problems when the agents have unlimited computational power. Our model is related to the so-called number-in-hand multi-party communication complexity, in which there are multiple agents and no shared information at the start of communication. This has already been shown to be important to understanding the processing of massive data: analysis of number-in-hand (NIH) communication complexity has been the main method for obtaining lower bounds on the space required for data stream algorithms (e.g. [3]).

However, there is something very different about the results that we prove here. In almost all prior lower bounds, there is at least one agent that has access to all communication between agents¹. (Typically, this is either via a shared blackboard to which all agents have access or a referee who receives all communication.) In this case, no problem on $N$ bits whose answer is $M$ bits long can be shown to require more than $N + M$ bits of communication. In our MPC model, all communication between servers is private and we restrict the communication per processor per step, rather than the total communication. Indeed, the privacy of communication is essential to our lower bounds, since we prove lower bounds that apply when the total communication is much larger than $N + M$. (Our lower bounds for some problems apply when the total communication is as large as $N^{1+\delta}$.)

2 Preliminaries

2.1 Massively Parallel Communication

We fix a parameter $\varepsilon \in [0, 1]$, called the space exponent, and define the MPC($\varepsilon$) model as follows.
The computation is performed by $p$ servers, called workers, connected by a complete network of private channels. The input data has size $N$ bits, and is initially distributed evenly among the $p$ workers. The computation proceeds in rounds, where each round consists of local computation at the workers interleaved with global communication. The complexity is measured in the number of communication rounds. The servers have unlimited computational power, but there is one important restriction: at each round, a worker may receive a total of only $O(N/p^{1-\varepsilon})$ bits of data from all other workers combined. Our goal is to find lower and upper bounds on the number of communication rounds.

¹ Though private-messages models have been defined before, we are aware of only two lines of work where lower bounds make use of the fact that no single agent has access to all communication: (1) Results of [11, 14] use the assumption that communication is both private and (multi-pass) one-way, but unlike the bounds we prove here, their lower bounds are smaller than the total input size; (2) Tiwari [24] defined a distributed model of communication complexity in networks in which an input is given to two processors that communicate privately using other helper processors. However, this model is equivalent to ordinary public two-party communication when the network allows direct private communication between any two processors, as our model does.

The space exponent represents the degree of replication during communication; in each round, the total amount of data exchanged is $O(p^{\varepsilon})$ times the size of the input data. When $\varepsilon = 0$, there is no replication, and we call this the basic MPC model. The case $\varepsilon = 1$ is degenerate because each server can receive the entire data, and any problem can be solved in a single round. Similarly, for any fixed $\varepsilon$, if we allow the computation to run for $\Theta(p^{1-\varepsilon})$ rounds, the entire data can be sent to every server and the model is again degenerate.

We denote by $M^r_{uv}$ the message sent by server $u$ to server $v$ during round $r$, and by $M^r_v = (M^{r-1}_v, (M^r_{1v}, \ldots, M^r_{pv}))$ the concatenation of all messages sent to $v$ up to round $r$. Assuming $O(1)$ rounds, each message $M^r_v$ holds $O(N/p^{1-\varepsilon})$ bits. For our multi-round lower bounds in Section 4, we will further restrict what the workers can encode in the messages $M^r_{uv}$ during rounds $r \ge 2$.

2.2 Randomization

The MPC model allows randomization. The random bits are available to all servers, and are computed independently of the input data. The algorithm may fail to produce its output with a small probability $\eta > 0$, independent of the input. For example, we use randomization for load balancing, and abort the computation if the amount of data received during a communication would exceed the $O(N/p^{1-\varepsilon})$ limit, but this will only happen with exponentially small probability.

To prove lower bounds for randomized algorithms, we use Yao's Lemma [26]. We first prove bounds for deterministic algorithms, showing that any algorithm fails with probability at least $\eta$ over inputs chosen randomly from a distribution $\mu$. This implies, by Yao's Lemma, that every randomized algorithm with the same resource bounds will fail on some input (in the support of $\mu$) with probability at least $\eta$ over the algorithm's random choices.

2.3 Conjunctive Queries

In this paper we consider a particular class of problems for the MPC model, namely computing answers to conjunctive queries over an input database.
We fix an input vocabulary $S_1, \ldots, S_\ell$, where each relation $S_j$ has a fixed arity $a_j$; we denote $a = \sum_{j=1}^{\ell} a_j$. The input data consists of one relation instance for each symbol. We denote by $n$ the largest number of tuples in any relation $S_j$; then, the entire database instance can be encoded using $N = O(n \log n)$ bits, because $\ell = O(1)$ and $a_j = O(1)$ for $j = 1, \ldots, \ell$.

We consider full conjunctive queries (CQs) without self-joins, denoted as follows:

\[ q(x_1, \ldots, x_k) = S_1(\bar{x}_1), \ldots, S_\ell(\bar{x}_\ell) \qquad (1) \]

The query is full, meaning that every variable in the body appears in the head (for example $q(x) = S(x, y)$ is not full), and without self-joins, meaning that each relation name $S_j$ appears only once (for example $q(x, y, z) = S(x, y), S(y, z)$ has a self-join). The hypergraph of a query $q$ is defined by introducing one node for each variable in the body and one hyperedge for each set of variables that occur in a single atom. We say that a conjunctive query is connected if the query hypergraph is connected (for example, $q(x, y) = R(x), S(y)$ is not connected). We use $\mathrm{vars}(S_j)$ to denote the set of variables in the atom $S_j$, and $\mathrm{atoms}(x_i)$ to denote the set of atoms where $x_i$ occurs; $k$ and $\ell$ denote the number of variables and atoms in $q$, as in (1). The connected components of $q$ are the maximal connected subqueries of $q$. Table 1 illustrates example queries used throughout this paper.
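The definitions of full and connected queries translate directly into a small data structure. The sketch below is illustrative only: representing a query as a Python dict from atom names to variable tuples is our assumption (it also rules out self-joins, since dict keys are unique), and the function names are hypothetical.

```python
def connected_components(query):
    """query: dict mapping each atom name to its tuple of variables (arity >= 1).
    Returns a list of sets of atom names, one set per connected component
    of the query hypergraph, using union-find over the variables."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for variables in query.values():
        for v in variables:
            parent.setdefault(v, v)
        for v in variables[1:]:
            parent[find(v)] = find(variables[0])  # all variables of one atom are linked

    components = {}
    for atom, variables in query.items():
        components.setdefault(find(variables[0]), set()).add(atom)
    return list(components.values())

def is_connected(query):
    return len(connected_components(query)) == 1

def is_full(head, query):
    """A query is full when every body variable also appears in the head."""
    body_vars = {v for variables in query.values() for v in variables}
    return body_vars <= set(head)

C3 = {"S1": ("x", "y"), "S2": ("y", "z"), "S3": ("z", "x")}
print(is_connected(C3), is_connected({"R": ("x",), "S": ("y",)}))
```

The two queries printed above are the cycle query $C_3$ and the disconnected example $q(x, y) = R(x), S(y)$ from the text.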

We consider two query evaluation problems. In JOIN-REPORTING, we require that all tuples in the relation defined by $q$ be produced. In JOIN-WITNESS, we require the production of at least one tuple in the relation defined by $q$, if one exists; JOIN-WITNESS is the verified version of the natural decision problem JOIN-NONEMPTINESS.

Characteristic of a Query. The characteristic of a conjunctive query $q$ as in (1) is defined as $\chi(q) = k + \ell - \sum_j a_j - c$, where $k$ is the number of variables, $\ell$ is the number of atoms, $a_j$ is the arity of atom $S_j$, and $c$ is the number of connected components of $q$.

For a query $q$ and a set of atoms $M \subseteq \mathrm{atoms}(q)$, define $q/M$ to be the query that results from contracting the edges of $M$ in the hypergraph of $q$. As an example, for the query $L_5$ in Table 1, $L_5/\{S_2, S_4\} = S_1(x_0, x_1), S_3(x_1, x_3), S_5(x_3, x_5)$.

Lemma 2.1. The characteristic of a query $q$ satisfies the following properties:
(a) If $q_1, \ldots, q_c$ are the connected components of $q$, then $\chi(q) = \sum_{i=1}^{c} \chi(q_i)$.
(b) For any $M \subseteq \mathrm{atoms}(q)$, $\chi(q/M) = \chi(q) - \chi(M)$.
(c) $\chi(q) \le 0$.
(d) For any $M \subseteq \mathrm{atoms}(q)$, $\chi(q) \le \chi(q/M)$.

Proof. Property (a) is immediate from the definition of $\chi$, since the connected components of $q$ are disjoint with respect to variables and atoms. Since $q/M$ can be produced by contracting according to each connected component of $M$ in turn, by property (a) and induction it suffices to show that property (b) holds in the case that $M$ is connected. If a connected $M$ has $k_M$ variables, $\ell_M$ atoms, and total arity $a_M$, then the query after contraction, $q/M$, will have the same number of connected components, $k_M - 1$ fewer variables, and the terms for the number of atoms and total arity will be reduced by $\ell_M - a_M$, for a total reduction of $k_M + \ell_M - a_M - 1 = \chi(M)$. Thus, property (b) follows. By property (a), it suffices to prove (c) when $q$ is connected. If $q$ is a single atom then $\chi(q) \le 0$, since the number of variables is at most the arity of the atom in $q$.
We reduce to this case by repeatedly contracting the atoms of $q$ until only one remains and showing that $\chi(q) \le \chi(q/S_j)$: let $m \le a_j$ be the number of distinct variables in atom $S_j$. Then, $\chi(q/S_j) = (\ell - 1) + (k - m + 1) - (a - a_j) - 1 = \chi(q) + (a_j - m) \ge \chi(q)$. Property (d) also follows inductively from $\chi(q) \le \chi(q/S_j)$ or by the combination of property (b) and property (c) applied to $M$.

Finally, let us call a query $q$ tree-like if $q$ is connected and $\chi(q) = 0$. For example, the query $L_k$ is tree-like, and so is any query over a binary vocabulary whose graph is a tree. Over non-binary vocabularies, any tree-like query is acyclic, but the converse does not hold: $q = S_1(x_0, x_1, x_2), S_2(x_1, x_2, x_3)$ is acyclic but not tree-like. An important property of tree-like queries is that every connected subquery will also be tree-like.

Vertex Cover and Edge Packing. A fractional vertex cover of a query $q$ is any feasible solution of the LP shown on the left of Fig. 1. The vertex cover associates a non-negative number $v_i$ to each variable $x_i$ such that every atom $S_j$ is covered, $\sum_{i : x_i \in \mathrm{vars}(S_j)} v_i \ge 1$. The dual LP corresponds to

a fractional edge packing problem (also known as a fractional matching problem), which associates non-negative numbers $u_j$ to each atom $S_j$. The two LPs have the same optimal value of the objective function, known as the fractional covering number [6] of the hypergraph associated with $q$ and denoted by $\tau(q)$. Thus, $\tau(q) = \min \sum_i v_i = \max \sum_j u_j$. Additionally, if all inequalities are satisfied as equalities by a solution to the LP, we say that the solution is tight.

Example 2.2. For a simple example, a fractional vertex cover of the query² $L_3 = S_1(x_1, x_2), S_2(x_2, x_3), S_3(x_3, x_4)$ is any solution to $v_1 + v_2 \ge 1$, $v_2 + v_3 \ge 1$ and $v_3 + v_4 \ge 1$; the optimum is achieved by $(v_1, v_2, v_3, v_4) = (0, 1, 1, 0)$, which is not tight. An edge packing is a solution to $u_1 \le 1$, $u_1 + u_2 \le 1$, $u_2 + u_3 \le 1$ and $u_3 \le 1$, and the optimum is achieved by $(1, 0, 1)$, which is tight.

The fractional edge packing should not be confused with the fractional edge cover, which has been used recently in several papers to prove bounds on query size and the running time of a sequential algorithm for the query [4, 20]; for the results in this paper we need the fractional packing. The two notions coincide, however, when they are tight.

2.4 Input Servers

We assume that, at the beginning of the algorithm, each relation $S_j$ is stored on a separate server, called an input server, which during the first round sends a message $M^1_{ju}$ to every worker $u$. After the first round, the input servers are no longer used in the computation. All lower bounds in this paper assume that the relations $S_j$ are given on separate input servers. All upper bounds hold for either model.

The lower bounds for the model with separate input servers carry over immediately to the standard MPC model, because any algorithm in the standard model can be simulated in the model with separate input servers.
Indeed, the algorithm must compute the output correctly for any initial distribution of the input data on the $p$ servers: we simply choose to distribute the input relations $S_1, \ldots, S_\ell$ such that the first $p/\ell$ servers receive $S_1$, the next $p/\ell$ servers receive $S_2$, etc., then simulate the algorithm in the model with separate input servers (see [17, proof of Proposition 3.5] for a detailed discussion). Thus, it suffices to prove our lower bounds assuming that each input relation is stored on a separate input server. In fact, this model is even more powerful, because an input server now has access to the entire relation $S_j$, and can therefore perform some

² We drop the head variables when clear from the context.

Vertex Covering LP:
\[ \text{minimize } \sum_{i=1}^{k} v_i \quad \text{subject to} \quad \forall j \in [\ell] : \sum_{i : x_i \in \mathrm{vars}(S_j)} v_i \ge 1, \qquad \forall i \in [k] : v_i \ge 0 \qquad (2) \]

Edge Packing LP:
\[ \text{maximize } \sum_{j=1}^{\ell} u_j \quad \text{subject to} \quad \forall i \in [k] : \sum_{j : x_i \in \mathrm{vars}(S_j)} u_j \le 1, \qquad \forall j \in [\ell] : u_j \ge 0 \qquad (3) \]

Figure 1: The vertex covering LP of the hypergraph of a query $q$, and its dual edge packing LP.
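The two LPs of Figure 1 can be checked on small queries without an LP solver: by LP duality, a feasible vertex cover and a feasible edge packing with the same objective value are both optimal, and that common value is $\tau(q)$. The Python sketch below is a hedged illustration under that observation; the dict-based query representation and the function names are our assumptions, and the candidate solutions must be supplied by hand.

```python
from fractions import Fraction as F

def is_vertex_cover(query, v):
    """v maps variables to weights; every atom must receive total weight >= 1."""
    return all(w >= 0 for w in v.values()) and all(
        sum(v[x] for x in set(variables)) >= 1 for variables in query.values())

def is_edge_packing(query, u):
    """u maps atoms to weights; the atoms containing any variable weigh <= 1 in total."""
    all_vars = {x for variables in query.values() for x in variables}
    return all(w >= 0 for w in u.values()) and all(
        sum(u[atom] for atom, variables in query.items() if x in variables) <= 1
        for x in all_vars)

def certified_tau(query, v, u):
    """Return tau(q) if (v, u) is a feasible primal/dual pair of equal value, else None."""
    if is_vertex_cover(query, v) and is_edge_packing(query, u) \
            and sum(v.values()) == sum(u.values()):
        return sum(v.values())
    return None

# Example 2.2: the linear query L_3.
L3 = {"S1": ("x1", "x2"), "S2": ("x2", "x3"), "S3": ("x3", "x4")}
tau_L3 = certified_tau(L3, {"x1": 0, "x2": 1, "x3": 1, "x4": 0},
                       {"S1": 1, "S2": 0, "S3": 1})

# The cycle query C_3 with all weights 1/2.
C3 = {"S1": ("x", "y"), "S2": ("y", "z"), "S3": ("z", "x")}
half = F(1, 2)
tau_C3 = certified_tau(C3, dict(x=half, y=half, z=half),
                       dict(S1=half, S2=half, S3=half))
print(tau_L3, tau_C3, 1 - 1 / tau_C3)  # the last value is the space exponent 1 - 1/tau
```

Exact rational arithmetic is used so that the equality test between the two objective values is reliable.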

global computation on $S_j$, for example compute statistics, find outliers, etc., which are common in practice.

Conjunctive query | Expected answer size | Minimum vertex cover | Share exponents | Value $\tau(q)$ | Space exponent
$C_k(x_1, \ldots, x_k) = \bigwedge_{j=1}^{k} S_j(x_j, x_{(j \bmod k)+1})$ | $1$ | $\frac{1}{2}, \ldots, \frac{1}{2}$ | $\frac{1}{k}, \ldots, \frac{1}{k}$ | $k/2$ | $1 - 2/k$
$T_k(z, x_1, \ldots, x_k) = \bigwedge_{j=1}^{k} S_j(z, x_j)$ | $n$ | $1, 0, \ldots, 0$ | $1, 0, \ldots, 0$ | $1$ | $0$
$L_k(x_0, x_1, \ldots, x_k) = \bigwedge_{j=1}^{k} S_j(x_{j-1}, x_j)$ | $n$ | $0, 1, 0, 1, \ldots$ | $0, \frac{1}{\lceil k/2 \rceil}, 0, \frac{1}{\lceil k/2 \rceil}, \ldots$ | $\lceil k/2 \rceil$ | $1 - 1/\lceil k/2 \rceil$
$B_{k,m}(x_1, \ldots, x_k) = \bigwedge_{I \subseteq [k], |I| = m} S_I(\bar{x}_I)$ | $n^{k - (m-1)\binom{k}{m}}$ | $\frac{1}{m}, \ldots, \frac{1}{m}$ | $\frac{1}{k}, \ldots, \frac{1}{k}$ | $k/m$ | $1 - m/k$

Table 1: Running examples in this paper: $C_k$ = cycle query, $L_k$ = linear query, $T_k$ = star query, and $B_{k,m}$ = query with $\binom{k}{m}$ relations, where each relation contains a distinct set of $m$ out of the $k$ head variables. Assuming the inputs are random permutations, the answer sizes represent exact values for $L_k$, $T_k$, and expected values for $C_k$, $B_{k,m}$.

2.5 Input Distribution

We find it useful to consider input databases of the following form that we call a matching database: the domain of the input database will be $[n]$, for $n > 0$. In such a database each relation $S_j$ is an $a_j$-dimensional matching, where $a_j$ is its arity. In other words, $S_j$ has exactly $n$ tuples and each of its columns contains exactly the values $1, 2, \ldots, n$; each attribute of $S_j$ is a key. For example, if $S_j$ is binary, then an instance of $S_j$ is a permutation on $[n]$; if $S_j$ is ternary then an instance consists of $n$ node-disjoint triangles. Moreover, the answer to a connected conjunctive query $q$ on a matching database is a table where each attribute is a key, because we have assumed that $q$ is full; in particular, the output to $q$ has at most $n$ tuples. In our lower bounds we assume that a matching database is randomly chosen with uniform probability, for a fixed $n$.

Matching databases are database instances without skew.
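A random matching database is simple to generate, which makes the definition easy to experiment with. The following Python sketch is illustrative only: `random_matching` and `join_L2` are hypothetical helper names, and each relation is drawn by shuffling every column independently, which yields a uniformly random matching when the relation is viewed as a set of tuples.

```python
import random

def random_matching(arity, n, rng):
    """Return n tuples of the given arity in which every column is a
    permutation of 1..n, so every attribute is a key."""
    columns = []
    for _ in range(arity):
        column = list(range(1, n + 1))
        rng.shuffle(column)
        columns.append(column)
    return list(zip(*columns))

def join_L2(s1, s2):
    """Evaluate L_2(x0, x1, x2) = S1(x0, x1), S2(x1, x2) on binary matchings.
    The first column of S2 is a key, so a dict lookup finds the unique partner."""
    partner = {b: c for b, c in s2}
    return [(a, b, partner[b]) for a, b in s1]

rng = random.Random(0)
n = 50
s1 = random_matching(2, n, rng)
s2 = random_matching(2, n, rng)
print(len(join_L2(s1, s2)))  # the answer to L_2 on a matching database has exactly n tuples
```

Because both inputs are permutations, every tuple of $S_1$ joins with exactly one tuple of $S_2$, in line with the answer size $n$ listed for $L_k$ in Table 1.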
By stating our lower bounds on matching databases we make them even stronger, because they imply that a query cannot be computed even in the absence of skew; of course, the lower bounds also hold for arbitrary instances. Our upper bounds, however, hold only on matching databases. Data skew is a known problem in parallel processing, and requires dedicated techniques. Lower and upper bounds accounting for the presence of skew are discussed in [17].

2.6 Friedgut's Inequality

Friedgut [10] introduces the following class of inequalities. Each inequality is described by a hypergraph, which in our paper corresponds to a query, so we will describe the inequality using query terminology. Fix a query $q$ as in (1), and let $n > 0$. For every atom $S_j(\bar{x}_j)$ of arity $a_j$, we introduce a set of $n^{a_j}$ variables $w_j(\mathbf{a}_j) \ge 0$, where $\mathbf{a}_j \in [n]^{a_j}$. If $\mathbf{a} \in [n]^k$, we denote by $\mathbf{a}_j$ the vector of size $a_j$ that results from projecting on the variables of the relation $S_j$. Let $u = (u_1, \ldots, u_\ell)$ be a

fractional edge cover for $q$. Then:

\[ \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} w_j(\mathbf{a}_j) \;\le\; \prod_{j=1}^{\ell} \Big( \sum_{\mathbf{a}_j \in [n]^{a_j}} w_j(\mathbf{a}_j)^{1/u_j} \Big)^{u_j} \qquad (4) \]

We illustrate Friedgut's inequality on $C_3$ and $L_3$:

\[ C_3(x, y, z) = S_1(x, y), S_2(y, z), S_3(z, x) \]
\[ L_3(x, y, z, w) = S_1(x, y), S_2(y, z), S_3(z, w) \qquad (5) \]

$C_3$ has cover $(1/2, 1/2, 1/2)$, and $L_3$ has cover $(1, 0, 1)$. Thus, we obtain the following inequalities, where $\alpha, \beta, \gamma$ stand for $w_1, w_2, w_3$ respectively:

\[ \sum_{x,y,z \in [n]} \alpha_{xy}\, \beta_{yz}\, \gamma_{zx} \;\le\; \sqrt{\sum_{x,y \in [n]} \alpha_{xy}^2 \; \sum_{y,z \in [n]} \beta_{yz}^2 \; \sum_{z,x \in [n]} \gamma_{zx}^2} \]

\[ \sum_{x,y,z,w \in [n]} \alpha_{xy}\, \beta_{yz}\, \gamma_{zw} \;\le\; \sum_{x,y \in [n]} \alpha_{xy} \cdot \max_{y,z \in [n]} \beta_{yz} \cdot \sum_{z,w \in [n]} \gamma_{zw} \]

where we used the fact that $\lim_{u \to 0} \big( \sum \beta_{yz}^{1/u} \big)^{u} = \max \beta_{yz}$.

Friedgut's inequalities immediately imply a well-known result developed in a series of papers [13, 4, 20] that gives an upper bound on the size of a query answer as a function of the cardinality of the relations. For example, in the case of $C_3$, consider an instance $S_1, S_2, S_3$, and set $\alpha_{xy} = 1$ if $(x, y) \in S_1$, otherwise $\alpha_{xy} = 0$ (and similarly for $\beta_{yz}$, $\gamma_{zx}$). We then obtain $|C_3| \le \sqrt{|S_1| \cdot |S_2| \cdot |S_3|}$. Note that all these results are expressed in terms of a fractional edge cover. When we apply Friedgut's inequality in Section 3.2 to a fractional edge packing, we ensure that the packing is tight.

3 One Communication Step

Let the space exponent of a query $q$ be the smallest $\varepsilon \ge 0$ for which $q$ can be computed using one communication step in the MPC($\varepsilon$) model. In this section, we prove Theorem 1.1, which gives both a general lower bound on the space exponent for evaluating connected conjunctive queries and a precise characterization of the space exponent for evaluating them over matching databases. The proof consists of two parts: we show the optimal algorithm in 3.1, and then present the matching lower bound in 3.2.

We will assume w.l.o.g. throughout this section that the queries do not contain any unary relations. Indeed, by definition, the only unary matching relation is the set $\{1, 2, \ldots, n\}$, and hence it is trivially known to all servers, so we can simply remove all unary relations before evaluating the query.
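Friedgut's inequality from Section 2.6 can be sanity-checked numerically for the cycle query $C_3$ with the cover $(1/2, 1/2, 1/2)$. The Python sketch below is only an illustration: the function name `friedgut_c3_sides` is hypothetical, the weights are stored as $n \times n$ nested lists indexed from 0, and a numerical check on sample inputs is of course not a proof.

```python
import math
import random

def friedgut_c3_sides(alpha, beta, gamma):
    """Return (lhs, rhs) of Friedgut's inequality for C_3 with cover (1/2, 1/2, 1/2).
    alpha[x][y], beta[y][z], gamma[z][x] are non-negative weights."""
    n = len(alpha)
    lhs = sum(alpha[x][y] * beta[y][z] * gamma[z][x]
              for x in range(n) for y in range(n) for z in range(n))
    squares = [sum(w * w for row in weights for w in row)
               for weights in (alpha, beta, gamma)]
    rhs = math.sqrt(squares[0] * squares[1] * squares[2])
    return lhs, rhs

# 0/1 weights encode relations: here S1 = S2 = S3 = the identity permutation on n values,
# so the left side counts the answers of C_3 and the right side is sqrt(|S1| |S2| |S3|).
n = 5
identity = [[1 if x == y else 0 for y in range(n)] for x in range(n)]
print(friedgut_c3_sides(identity, identity, identity))

# Random non-negative weights.
rng = random.Random(0)
rand = [[[rng.random() for _ in range(n)] for _ in range(n)] for _ in range(3)]
lhs, rhs = friedgut_c3_sides(*rand)
print(lhs <= rhs)
```

In the 0/1 case the check reduces to the query size bound $|C_3| \le \sqrt{|S_1| \cdot |S_2| \cdot |S_3|}$ stated above.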
3.1 An Algorithm for One Round

We describe here an algorithm, which we call HYPERCUBE (HC), that computes a conjunctive query in one step. It uses ideas that can be traced back to Ganguly [12] for parallel processing of

Datalog programs, and were also used by Afrati and Ullman [2] to optimize joins in MapReduce, and by Suri and Vassilvitskii [22] to count triangles.

Let $q$ be a query as in (1). Associate to each variable $x_i$ a real value $e_i \ge 0$, called the share exponent of $x_i$, such that $\sum_{i=1}^{k} e_i = 1$. If $p$ is the number of servers, define $p_i = p^{e_i}$: these values are called shares [2]. We assume that the shares are integers. Thus, $p = \prod_{i=1}^{k} p_i$, and each server can be uniquely identified with a point in the $k$-dimensional hypercube $[p_1] \times \cdots \times [p_k]$.

The algorithm uses $k$ independently chosen random hash functions $h_i : [n] \to [p_i]$, one for each variable $x_i$. During the communication step, the algorithm sends every tuple $S_j(\mathbf{a}_j) = S_j(\alpha_{i_1}, \ldots, \alpha_{i_{a_j}})$ to all servers $y \in [p_1] \times \cdots \times [p_k]$ such that $h_{i_m}(\alpha_{i_m}) = y_{i_m}$ for every $1 \le m \le a_j$. In other words, the tuple $S_j(\mathbf{a}_j)$ knows the server number along the dimensions $i_1, \ldots, i_{a_j}$, but does not know the server number along the other dimensions, and there it needs to be replicated. After receiving the data, each server outputs all query answers derivable from the received data. The algorithm finds all answers, because each potential output tuple $(\alpha_1, \ldots, \alpha_k)$ is known by the server $y = (h_1(\alpha_1), \ldots, h_k(\alpha_k))$.

Example 3.1. We illustrate how to compute $C_3(x_1, x_2, x_3) = S_1(x_1, x_2), S_2(x_2, x_3), S_3(x_3, x_1)$. Consider the share exponents $e_1 = e_2 = e_3 = 1/3$. Each of the $p$ servers is uniquely identified by a triple $(y_1, y_2, y_3)$, where $y_1, y_2, y_3 \in [p^{1/3}]$. In the first communication round, the input server storing $S_1$ sends each tuple $S_1(\alpha_1, \alpha_2)$ to all servers with index $(h_1(\alpha_1), h_2(\alpha_2), y_3)$, for all $y_3 \in [p^{1/3}]$: notice that each tuple is replicated $p^{1/3}$ times. The input servers holding $S_2$ and $S_3$ proceed similarly with their tuples.
After round 1, any three tuples $S_1(\alpha_1, \alpha_2)$, $S_2(\alpha_2, \alpha_3)$, $S_3(\alpha_3, \alpha_1)$ that contribute to the output tuple $C_3(\alpha_1, \alpha_2, \alpha_3)$ will be seen by the server $y = (h_1(\alpha_1), h_2(\alpha_2), h_3(\alpha_3))$: any server that detects three matching tuples outputs them.

Proposition 3.2. Fix a fractional vertex cover $v = (v_1, \ldots, v_k)$ for a connected conjunctive query $q$, and let $\tau = \sum_i v_i$. The HC algorithm with share exponents $e_i = v_i/\tau$ computes $q$ on any matching database in one round in MPC($\varepsilon$), where $\varepsilon = 1 - 1/\tau$, with probability of failure $\eta \leq \exp(-O(n/p^{\varepsilon}))$.

This proves the optimality claim of Theorem 1.1: choose a vertex cover with value $\tau^*(q)$, the fractional covering number of $q$. Proposition 3.2 shows that $q$ can be computed in one round in MPC($\varepsilon$), with $\varepsilon = 1 - 1/\tau^*$.

Proof. Since $v$ forms a fractional vertex cover, for every relation symbol $S_j$ we have $\sum_{i : x_i \in vars(S_j)} e_i \geq 1/\tau$. Therefore, $\sum_{i : x_i \notin vars(S_j)} e_i \leq 1 - 1/\tau$. Every tuple $S_j(\mathbf{a}_j)$ is replicated $\prod_{i : x_i \notin vars(S_j)} p_i \leq p^{1 - 1/\tau}$ times. Thus, the total number of tuples that are received by all servers is $O(n \cdot p^{1 - 1/\tau})$. We claim that these tuples are uniformly distributed among the $p$ servers: this proves the proposition, since then each server receives $O(n/p^{1/\tau})$ tuples.

To prove the claim, we note that for each tuple $t \in S_j$, the probability over the random choices of the hash functions $h_1, \ldots, h_k$ that the tuple is sent to server $s$ is precisely $\prod_{i : x_i \in vars(S_j)} p_i^{-1}$. Thus, the expected number of tuples from $S_j$ sent to $s$ is $n / \prod_{i : x_i \in vars(S_j)} p_i \leq n/p^{1-\varepsilon}$. Since $S_j$ is an $a_j$-matching, different tuples are sent by the random hash functions to independent destinations, since any two tuples differ in every attribute. Using standard Chernoff bounds, we derive that the probability that the actual number of tuples per server deviates more than a constant factor from the expected number is $\eta \leq \exp(-O(n/p^{1-\varepsilon}))$.
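The routing rule of Example 3.1 can be sketched in a few lines of Python. This is a minimal single-process simulation, not the authors' implementation: the function name `hypercube_triangles`, the parameter `side` (standing for $p^{1/3}$, assumed to be an integer), and the lazily memoised dictionaries used as random hash functions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def hypercube_triangles(S1, S2, S3, side, seed=0):
    """Simulate one HC round for C3(x1,x2,x3) = S1(x1,x2), S2(x2,x3), S3(x3,x1).

    `side` plays the role of p^(1/3); servers are the points of [side]^3.
    Returns the set of answers and the per-server contents.
    """
    rng = random.Random(seed)
    # One independent random hash function per variable, built lazily.
    tables = [dict(), dict(), dict()]

    def h(i, value):
        if value not in tables[i]:
            tables[i][value] = rng.randrange(side)
        return tables[i][value]

    # Each server stores the fragments of S1, S2, S3 it receives.
    servers = defaultdict(lambda: (set(), set(), set()))

    # A tuple fixes two coordinates and is replicated along the third.
    for (a1, a2) in S1:
        for y3 in range(side):
            servers[(h(0, a1), h(1, a2), y3)][0].add((a1, a2))
    for (a2, a3) in S2:
        for y1 in range(side):
            servers[(y1, h(1, a2), h(2, a3))][1].add((a2, a3))
    for (a3, a1) in S3:
        for y2 in range(side):
            servers[(h(0, a1), y2, h(2, a3))][2].add((a3, a1))

    # Local computation: every server joins only what it received.
    answers = set()
    for r1, r2, r3 in servers.values():
        for (a1, a2) in r1:
            for (b2, a3) in r2:
                if b2 == a2 and (a3, a1) in r3:
                    answers.add((a1, a2, a3))
    return answers, servers
```

Every triangle $(\alpha_1, \alpha_2, \alpha_3)$ is found at the server $(h_1(\alpha_1), h_2(\alpha_2), h_3(\alpha_3))$, and each input tuple is stored on exactly `side` servers, which mirrors the replication factor $p^{1/3}$ in the example.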

3.2 A Lower Bound for One Round

For a fixed $n$, consider a probability distribution where the input $I$ is chosen randomly, with uniform probability from all matching database instances. Let $E[|q(I)|]$ denote the expected number of answers to the query $q$. We prove in this section:³

Theorem 3.3. Let $q$ be a connected conjunctive query, let $\tau^*$ be the fractional covering number of $q$, and $\varepsilon < 1 - 1/\tau^*$. Then, any deterministic MPC($\varepsilon$) algorithm that runs in one communication round on $p$ servers reports $O(E[|q(I)|]/p^{\tau^*(1-\varepsilon)-1})$ answers in expectation.

³ Recall that we have assumed that $q$ has no unary relations. Otherwise, the theorem fails, as illustrated by the query $q = S_1(x), S_2(x, y), S_3(y)$, which has $\tau^* = 2$, yet can be computed with space exponent $\varepsilon = 0$, on arbitrary databases (not only matching databases). Indeed, notice that both unary relations $S_1, S_3$ require $n$ bits to be represented (as bit vectors), whereas $S_2$ requires $\Omega(n \log n)$ bits. Hence, if $p \leq \log n$, $S_1, S_3$ can be broadcast to every server.

In particular, the theorem implies that the space exponent of $q$ is at least $1 - 1/\tau^*$. Before we prove the theorem, we show how to extend it to randomized algorithms using Yao's principle. For this, we show a lemma that we also need later.

Lemma 3.4. The expected number of answers to a connected query $q$ is $E[|q(I)|] = n^{1+\chi(q)}$, where the expectation is over a uniformly chosen matching database $I$.

Proof. For any relation $S_j$, and any tuple $\mathbf{a}_j \in [n]^{a_j}$, the probability that $S_j$ contains $\mathbf{a}_j$ is $P(\mathbf{a}_j \in S_j) = n^{1-a_j}$. Given a tuple $\mathbf{a} \in [n]^k$ of the same arity as the query answer, let $\mathbf{a}_j$ denote its projection on the variables in $S_j$. Then:

$$
\begin{aligned}
E[|q(I)|] &= \sum_{\mathbf{a} \in [n]^k} P\Big(\bigwedge_{j=1}^{\ell} (\mathbf{a}_j \in S_j)\Big) \\
&= \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} P(\mathbf{a}_j \in S_j) \\
&= \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} n^{1-a_j} \\
&= n^{k + \ell - a}
\end{aligned}
$$

Since query $q$ is connected, $k + \ell - a = 1 + \chi(q)$ and hence $E[|q(I)|] = n^{1+\chi(q)}$.

Theorem 3.3 and Lemma 3.4, together with Yao's lemma, imply the following lower bound for randomized algorithms.

Corollary 3.5. Let $q$ be any connected conjunctive query. Any one round randomized MPC($\varepsilon$) algorithm with $p = \omega(1)$ and $\varepsilon < 1 - 1/\tau^*(q)$ fails to compute $q$ with probability $\eta = \Omega(n^{\chi(q)}) = n^{-O(1)}$.

Proof. Choose a matching database $I$ input to $q$ uniformly at random. Let $A(I)$ denote the set of correct answers returned by the algorithm on $I$: $A(I) \subseteq q(I)$. Observe that the algorithm fails on $I$ iff $|q(I) - A(I)| > 0$.

Let $\gamma = 1/p^{\tau^*(q)(1-\varepsilon)-1}$. Since $p = \omega(1)$ and $\varepsilon < 1 - 1/\tau^*(q)$, it follows that $\gamma = o(1)$. By Theorem 3.3, for any deterministic one round MPC($\varepsilon$) algorithm we have $E[|A(I)|] = O(\gamma) E[|q(I)|]$ and hence, by Lemma 3.4,

$$E[|q(I) - A(I)|] = (1 - o(1)) E[|q(I)|] = (1 - o(1))\, n^{1+\chi(q)}.$$

However, we also have that

$$E[|q(I) - A(I)|] \leq P[|q(I) - A(I)| > 0] \cdot \max_{I} |q(I) - A(I)|.$$

Since $|q(I) - A(I)| \leq |q(I)| \leq n$ for all $I$, we see that the failure probability of the algorithm for randomly chosen $I$, $P[|q(I) - A(I)| > 0]$, is at least $\eta = (1 - o(1))\, n^{\chi(q)}$, which is $n^{-O(1)}$ for any $q$. Yao's lemma implies that every one round randomized MPC($\varepsilon$) algorithm will fail to compute $q$ with probability at least $\eta$ on some matching database input.

In the rest of the section we prove Theorem 3.3, which deals with one-round deterministic algorithms and random matching databases $I$. Let us fix some server and let $m(I)$ denote the function specifying the message the server receives on input $I$. Intuitively, this server can only report those tuples that it knows are in the input based on the value of $m(I)$. To make this notion precise, for any fixed value $m$ of $m(I)$, define the set of tuples of a relation $R$ of arity $a$ known by the server given message $m$ as

$$K_m(R) = \{ t \in [n]^a \mid \text{for all matching databases } I,\ m(I) = m \Rightarrow t \in R(I) \}$$

We will particularly apply this definition with $R = S_j$ and $R = q$. Clearly, an output tuple $\mathbf{a} \in K_m(q)$ iff for every $j$, $\mathbf{a}_j \in K_m(S_j)$, where $\mathbf{a}_j$ denotes the projection of $\mathbf{a}$ on the variables in the atom $S_j$.

We will first prove an upper bound for each $|K_m(S_j)|$. Then we use this bound, along with Friedgut's inequality, to establish an upper bound for $|K_m(q)|$ and hence prove Theorem 3.3.

Bounding the Knowledge of Each Relation

Let us fix a server, and an input relation $S_j$. Observe that, for a randomly chosen matching database $I$, $S_j$ is a uniformly chosen $a_j$-dimensional matching. There are precisely $(n!)^{a_j - 1}$ different $a_j$-dimensional matchings and thus the number of bits $N$ necessary to represent the relation is $(a_j - 1)\log(n!)$.
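Both the count of matchings just stated and the expectation in Lemma 3.4 can be checked by brute force for very small $n$. The sketch below is an illustration under stated assumptions: the helper names `matchings` and `expected_triangles` are hypothetical, matchings are represented as frozensets of tuples over $\{0, \ldots, n-1\}$, and the query used is the triangle $C_3$ of Example 3.1, for which $1 + \chi(C_3) = k + \ell - a = 0$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def matchings(n, arity):
    """Yield every arity-dimensional matching over {0..n-1}.

    A matching has n tuples and every value occurs exactly once in each
    column, so it is determined by (arity - 1) permutations once the
    first column is listed in increasing order.
    """
    perms = list(permutations(range(n)))
    for cols in product(perms, repeat=arity - 1):
        yield frozenset((i,) + tuple(c[i] for c in cols) for i in range(n))


def expected_triangles(n):
    """Exact E[|C3(I)|] over all triples of binary matchings on {0..n-1}."""
    binary = list(matchings(n, 2))
    total = 0
    for S1, S2, S3 in product(binary, repeat=3):
        total += sum(
            1
            for (a1, a2) in S1
            for (b2, a3) in S2
            if b2 == a2 and (a3, a1) in S3
        )
    return Fraction(total, len(binary) ** 3)
```

For example, `len(set(matchings(3, 3)))` equals `factorial(3) ** 2`, and `expected_triangles(3)` averages over all $6^3$ inputs, which is only feasible because $n$ is tiny.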
Let $m(S_j)$ be the part of the message $m$ received from the server that corresponds to $S_j$. The following lemma provides a bound on the expected knowledge $|K_{m(S_j)}(S_j)|$ the server may obtain from $S_j$:

Lemma 3.6. Suppose that for all $(n!)^{a_j - 1}$ matchings $S_j$ of arity $a_j > 1$, the message $m(S_j)$ is at most $f_j \cdot (a_j - 1)\log(n!)$ bits long. Then $E[|K_{m(S_j)}(S_j)|] \leq f_j \cdot n$, where the expectation is taken over random choices of the matching $S_j$.

In other words, if the message $m(S_j)$ contains only a fraction $f_j$ of the bits needed to encode $S_j$, then a server receiving this message knows only a fraction $f_j$ of the $n$ tuples in $S_j$. We will apply the lemma separately to each input $S_j$, for $j = 1, \ldots, \ell$ in the next section, for appropriate choices of $f_j$.

Proof. Let $m$ be a possible value for $m(S_j)$. Since $m$ fixes precisely $|K_m(S_j)|$ tuples of $S_j$,

$$
\begin{aligned}
\log |\{ S_j \mid m(S_j) = m \}| &\leq (a_j - 1) \sum_{i=1}^{n - |K_m(S_j)|} \log i \\
&\leq (1 - |K_m(S_j)|/n)\,(a_j - 1) \sum_{i=1}^{n} \log i \\
&= (1 - |K_m(S_j)|/n)\, \log (n!)^{a_j - 1}. \qquad (6)
\end{aligned}
$$

We can bound the value we want by considering the binary entropy of the distribution $S_j$. By applying the chain rule for entropy, we have

$$
\begin{aligned}
H(S_j) &= H(m(S_j)) + \sum_{m} P(m(S_j) = m) \cdot H(S_j \mid m(S_j) = m) \\
&\leq f_j \cdot H(S_j) + \sum_{m} P(m(S_j) = m) \cdot H(S_j \mid m(S_j) = m) \\
&\leq f_j \cdot H(S_j) + \sum_{m} P(m(S_j) = m) \cdot \Big(1 - \frac{|K_m(S_j)|}{n}\Big) H(S_j) \\
&= f_j \cdot H(S_j) + \Big(1 - \sum_{m} P(m(S_j) = m) \frac{|K_m(S_j)|}{n}\Big) H(S_j) \\
&= f_j \cdot H(S_j) + \Big(1 - \frac{E[|K_{m(S_j)}(S_j)|]}{n}\Big) H(S_j) \qquad (7)
\end{aligned}
$$

where the first inequality follows from the assumed upper bound on $|m(S_j)|$, the second inequality follows by (6), and the last two lines follow by definition. Dividing both sides of (7) by $H(S_j)$, since $H(S_j)$ is not zero, and rearranging we obtain the required statement.

Bounding the Knowledge of the Query

Let $c$ be a constant such that each server receives at most $cn/p^{1-\varepsilon}$ bits. Let us also fix some server. The message $m = m(I)$ received by the server is the concatenation of $\ell$ messages, one for each input relation. $K_m(S_j)$ depends only on $m(S_j)$, so we can assume w.l.o.g. that $K_m(S_j) = K_{m(S_j)}(S_j)$. In order to represent the total input $I$, we need $\sum_{j=1}^{\ell} (a_j - 1)\log(n!) = (a - \ell)\log(n!)$ bits. Hence, the message $m(I)$ will contain at most $c(a - \ell)\log(n!)/p^{1-\varepsilon}$ bits. Now, for each relation $S_j$, let us define

$$f_j = \frac{\max_{S_j} |m(S_j)|}{(a_j - 1)\log(n!)}.$$

Note that this is well-defined, since $a_j > 1$. Thus, $f_j$ is the largest fraction of bits of $S_j$ that the server receives, over all choices of the matching $S_j$. We now derive an upper bound on the $f_j$'s.

As we discussed before, each part $m(S_j)$ of the message is constructed independently of the other relations. Hence, it must be that

$$\sum_{j=1}^{\ell} \max_{S_j} |m(S_j)| \leq c(a - \ell)\log(n!)/p^{1-\varepsilon}.$$

By substituting the definition of $f_j$ in this equation, we obtain that $\sum_{j=1}^{\ell} f_j (a_j - 1) \leq c(a - \ell)/p^{1-\varepsilon}$. Lemma 3.6 further implies that, for a randomly chosen matching database $I$, $E[|K_{m(I)}(S_j)|] = E[|K_{m(S_j)}(S_j)|] \leq f_j \cdot n$ for all $j \in [\ell]$. We prove:

Lemma 3.7. $E[|K_{m(I)}(q)|] \leq g_{q,c} \cdot E[|q(I)|]/p^{(1-\varepsilon)\tau^*}$ for randomly chosen matching database $I$, where $g_{q,c} = (c(a - \ell)/\tau^*)^{\tau^*}$ is a constant that depends only on the constant $c$ and the query $q$.

This proves Theorem 3.3, since we can apply a union bound to show that the total number of tuples known by all $p$ servers is bounded by

$$p \cdot E[|K_{m(I)}(q)|] \leq p \cdot g_{q,c} \cdot E[|q(I)|]/p^{(1-\varepsilon)\tau^*},$$

which is the upper bound in Theorem 3.3 since $g_{q,c}$ is a constant when $q$ is fixed.

In the rest of the section we prove Lemma 3.7. We start with some notation. For $\mathbf{a}_j \in [n]^{a_j}$, let $w_j(\mathbf{a}_j)$ denote the probability that the server knows the tuple $\mathbf{a}_j$. In other words $w_j(\mathbf{a}_j) = P(\mathbf{a}_j \in K_{m_j(S_j)}(S_j))$, where the probability is over the random choices of $S_j$.

Lemma 3.8. For any relation $S_j$ of arity $a_j > 1$:
(a) $\forall \mathbf{a}_j \in [n]^{a_j} : w_j(\mathbf{a}_j) \leq n^{1-a_j}$, and
(b) $\sum_{\mathbf{a}_j \in [n]^{a_j}} w_j(\mathbf{a}_j) \leq f_j \cdot n$.

Proof. To show (a), notice that $w_j(\mathbf{a}_j) \leq P(\mathbf{a}_j \in S_j) = n^{1-a_j}$, while (b) follows from the fact $\sum_{\mathbf{a}_j \in [n]^{a_j}} w_j(\mathbf{a}_j) = E[|K_{m_j(S_j)}(S_j)|] \leq f_j \cdot n$.

Since the server receives a separate message for each relation $S_j$, from a distinct input server, the events $\mathbf{a}_1 \in K_{m_1}(S_1), \ldots, \mathbf{a}_\ell \in K_{m_\ell}(S_\ell)$ are independent, hence:

$$E[|K_{m(I)}(q)|] = \sum_{\mathbf{a} \in [n]^k} P(\mathbf{a} \in K_{m(I)}(q)) = \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} w_j(\mathbf{a}_j)$$

We now prove Lemma 3.7 using Friedgut's inequality. Recall that in order to apply the inequality, we need to find a fractional edge cover. Fix an optimal fractional edge packing $u = (u_1, \ldots, u_\ell)$ as in Fig. 1.
By duality, we have that $\sum_j u_j = \tau^*$, where $\tau^*$ is the fractional covering number (which is the value of the optimal fractional vertex cover, and equal to the value of the optimal fractional edge packing). Given $q$, defined as in (1), consider the extended query, which has a new unary atom for each variable $x_i$:

$$q'(x_1, \ldots, x_k) = S_1(\bar{x}_1), \ldots, S_\ell(\bar{x}_\ell), T_1(x_1), \ldots, T_k(x_k)$$

For each new symbol $T_i$, define $u'_i = 1 - \sum_{j : x_i \in vars(S_j)} u_j$. Since $u$ is a packing, $u'_i \geq 0$. Let us define $u' = (u'_1, \ldots, u'_k)$.
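The slack weights $u'_i$ just defined can be computed mechanically from a packing. The following sketch is illustrative only: the function names `slack_weights` and `weighted_total` are hypothetical, atoms are assumed to be given as tuples of distinct variable names, and exact rational arithmetic is used so that the total $\sum_j a_j u_j + \sum_i u'_i$ can be compared with $k$ without rounding error.

```python
from fractions import Fraction


def slack_weights(atoms, u):
    """Return {x_i: u'_i} with u'_i = 1 - sum of u_j over atoms containing x_i.

    `atoms` is a list of tuples of distinct variable names and `u` the
    matching list of packing weights. Raises ValueError if `u` is not a
    fractional edge packing (some variable carries total weight above 1).
    """
    variables = sorted({x for atom in atoms for x in atom})
    slack = {}
    for x in variables:
        load = sum((w for atom, w in zip(atoms, u) if x in atom), Fraction(0))
        if load > 1:
            raise ValueError(f"not a packing: variable {x} has load {load}")
        slack[x] = 1 - load
    return slack


def weighted_total(atoms, u):
    """Compute sum_j a_j * u_j + sum_i u'_i for the extended query q'."""
    slack = slack_weights(atoms, u)
    return sum(len(atom) * w for atom, w in zip(atoms, u)) + sum(slack.values())


# Triangle query of Example 3.1 with the packing u = (1/2, 1/2, 1/2).
triangle = [("x1", "x2"), ("x2", "x3"), ("x3", "x1")]
half = Fraction(1, 2)
triangle_slack = slack_weights(triangle, [half, half, half])
```

For the triangle every variable is already tight, so all slack weights are zero and the total equals $k = 3$; for a two-atom path with $u = (1/2, 1/2)$ the two endpoint variables each receive slack $1/2$.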

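Lemma 3.9. (a) The assignment $(u, u')$ is both a tight fractional edge packing and a tight fractional edge cover for $q'$. (b) $\sum_{j=1}^{\ell} a_j u_j + \sum_{i=1}^{k} u'_i = k$.

Proof. (a) is straightforward, since for every variable $x_i$ we have $u'_i + \sum_{j : x_i \in vars(S_j)} u_j = 1$. Summing up:

$$k = \sum_{i=1}^{k} \Big( u'_i + \sum_{j : x_i \in vars(S_j)} u_j \Big) = \sum_{i=1}^{k} u'_i + \sum_{j=1}^{\ell} a_j u_j$$

which proves (b).

We will apply Friedgut's inequality to the extended query $q'$ to prove Lemma 3.7. Set the variables $w(\cdot)$ used in Friedgut's inequality as follows:

$w_j(\mathbf{a}_j) = P(\mathbf{a}_j \in K_{m_j(S_j)}(S_j))$ for $S_j$, tuple $\mathbf{a}_j \in [n]^{a_j}$
$w'_i(\alpha) = 1$ for $T_i$, value $\alpha \in [n]$

Recall that, for a tuple $\mathbf{a} \in [n]^k$ we use $\mathbf{a}_j \in [n]^{a_j}$ for its projection on the variables in $S_j$; with some abuse, we write $\mathbf{a}_i \in [n]$ for the projection on the variable $x_i$. Assume first that $u_j > 0$, for $j = 1, \ldots, \ell$. Then:

$$
\begin{aligned}
E[|K_m(q)|] &= \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} w_j(\mathbf{a}_j) \\
&= \sum_{\mathbf{a} \in [n]^k} \prod_{j=1}^{\ell} w_j(\mathbf{a}_j) \prod_{i=1}^{k} w'_i(\mathbf{a}_i) \\
&\leq \prod_{j=1}^{\ell} \Big( \sum_{\mathbf{a} \in [n]^{a_j}} w_j(\mathbf{a})^{1/u_j} \Big)^{u_j} \prod_{i=1}^{k} \Big( \sum_{\alpha \in [n]} w'_i(\alpha)^{1/u'_i} \Big)^{u'_i} \\
&= \prod_{j=1}^{\ell} \Big( \sum_{\mathbf{a} \in [n]^{a_j}} w_j(\mathbf{a})^{1/u_j} \Big)^{u_j} \prod_{i=1}^{k} n^{u'_i}
\end{aligned}
$$

Note that, since $w'_i(\alpha) = 1$ we have $w'_i(\alpha)^{1/u'_i} = 1$ even if $u'_i = 0$. Write $w_j(\mathbf{a})^{1/u_j} = w_j(\mathbf{a})^{1/u_j - 1}\, w_j(\mathbf{a})$, and use Lemma 3.8 to obtain:

$$
\begin{aligned}
\sum_{\mathbf{a} \in [n]^{a_j}} w_j(\mathbf{a})^{1/u_j} &\leq (n^{1-a_j})^{1/u_j - 1} \sum_{\mathbf{a} \in [n]^{a_j}} w_j(\mathbf{a}) \\
&\leq n^{(1-a_j)(1/u_j - 1)}\, f_j \cdot n \\
&= f_j \cdot n^{(a_j - a_j/u_j + 1/u_j)}
\end{aligned}
$$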

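Plugging this in the bound, we have shown that:

$$
\begin{aligned}
E[|K_m(q)|] &\leq \prod_{j=1}^{\ell} \big( f_j \cdot n^{(a_j - a_j/u_j + 1/u_j)} \big)^{u_j} \cdot \prod_{i=1}^{k} n^{u'_i} \\
&= \prod_{j=1}^{\ell} f_j^{u_j} \cdot n^{(\sum_{j} a_j u_j - a + \ell)} \cdot n^{\sum_{i=1}^{k} u'_i} \\
&= \prod_{j=1}^{\ell} f_j^{u_j} \cdot n^{(\ell - a)} \cdot n^{(\sum_{j} a_j u_j + \sum_{i=1}^{k} u'_i)} \\
&= \prod_{j=1}^{\ell} f_j^{u_j} \cdot n^{\ell + k - a} \\
&= \prod_{j=1}^{\ell} f_j^{u_j} \cdot n^{1+\chi(q)} = \prod_{j=1}^{\ell} f_j^{u_j} \cdot E[|q(I)|] \qquad (8)
\end{aligned}
$$

If some $u_j = 0$, then we can derive the same bound as follows: we can replace each $u_j$ with $u_j + \delta$ for any $\delta > 0$, still yielding an edge cover. Then we have $\sum_j a_j u_j + \sum_i u'_i = k + a\delta$, and hence an extra factor $n^{a\delta}$ multiplying the term $n^{\ell + k - a}$ in (8); however, we obtain the same upper bound since, in the limit as $\delta$ approaches 0, this extra factor approaches 1.

Let $f_q = \prod_{j=1}^{\ell} f_j^{u_j}$; the final step is to upper bound the quantity $f_q$ using the fact that $\sum_{j=1}^{\ell} f_j (a_j - 1) \leq c(a - \ell)/p^{1-\varepsilon}$. Indeed:

$$
\begin{aligned}
\log f_q &= \sum_{j=1}^{\ell} u_j \log f_j \\
&= \tau^* \sum_{j=1}^{\ell} \frac{u_j}{\tau^*} \log \frac{f_j (a_j - 1)}{u_j} + \sum_{j=1}^{\ell} u_j \log \frac{u_j}{a_j - 1} \\
&\leq \tau^* \sum_{j=1}^{\ell} \frac{u_j}{\tau^*} \log \frac{f_j (a_j - 1)}{u_j} \\
&\leq \tau^* \log \sum_{j=1}^{\ell} \frac{f_j (a_j - 1)}{\tau^*} \\
&\leq \tau^* \log \frac{c(a - \ell)}{\tau^*\, p^{1-\varepsilon}}
\end{aligned}
$$

Here, the first inequality comes from the fact that $u_j \leq 1$ and $a_j - 1 \geq 1$ (since the arity is at least 2), and hence $\log \frac{u_j}{a_j - 1} \leq 0$, for all $j = 1, \ldots, \ell$. The second inequality follows from Jensen's inequality and concavity of $\log$, and the last one from the bound on $\sum_j f_j (a_j - 1)$. Thus, we obtain

$$f_q = \prod_{j=1}^{\ell} f_j^{u_j} \leq \Big( \frac{c(a - \ell)}{\tau^*} \Big)^{\tau^*} \cdot \frac{1}{p^{(1-\varepsilon)\tau^*}} \qquad (9)$$

Recall that we have defined $g_{q,c} = (c(a - \ell)/\tau^*)^{\tau^*}$. Thus, combining (8) with (9) concludes the proof of Lemma 3.7.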
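As a concrete instance of Lemma 3.7 and of the bounds (8) and (9), consider the triangle query $C_3$ of Example 3.1. The values below follow from the definitions (each atom has arity 2, and the packing $u_j = 1/2$ is tight on every variable); the worked numbers are an illustration added here and are not part of the original derivation.

```latex
\begin{align*}
k &= 3, \quad \ell = 3, \quad a = \sum_j a_j = 6, \quad 1 + \chi(C_3) = k + \ell - a = 0,\\
u &= \left(\tfrac12, \tfrac12, \tfrac12\right), \quad
   \tau^* = \sum_j u_j = \tfrac32, \quad
   u'_i = 1 - \left(\tfrac12 + \tfrac12\right) = 0,\\
\mathbf{E}[|q(I)|] &= n^{1 + \chi(C_3)} = 1,\\
g_{q,c} &= \left(\frac{c(a - \ell)}{\tau^*}\right)^{\tau^*} = (2c)^{3/2},\\
\mathbf{E}[|K_{m(I)}(q)|] &\le \frac{(2c)^{3/2}}{p^{3(1-\varepsilon)/2}},\\
p \cdot \mathbf{E}[|K_{m(I)}(q)|] &\le \frac{(2c)^{3/2}}{p^{3(1-\varepsilon)/2 - 1}}.
\end{align*}
```

With $\varepsilon = 0$ the $p$ servers together know only an $O(p^{-1/2})$ fraction of the expected answers, and the exponent $3(1-\varepsilon)/2 - 1$ reaches zero exactly at $\varepsilon = 1/3 = 1 - 1/\tau^*$, the space exponent at which Proposition 3.2 computes $C_3$ in one round.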

3.3 Extensions

Proposition 3.2 and Theorem 3.3 imply that, over matching databases, the space exponent of a query $q$ is $1 - 1/\tau^*$, where $\tau^*$ is its fractional covering number. Table 1 illustrates the space exponent for various families of conjunctive queries. We now discuss a few extensions and corollaries. As a corollary of Theorem 3.3 we can characterize the queries with space exponent zero, i.e. those that can be computed in a single round without any replication.

Corollary. A query $q$ has covering number $\tau^*(q) = 1$ if and only if there exists a variable shared by all atoms.

Proof. The "if" direction is straightforward: if $x_i$ occurs in all atoms, then $v_i = 1$ and $v_j = 0$ for all $j \neq i$ is a fractional vertex cover with value 1, proving $\tau^* = 1$ since $\tau^*(q) \geq 1$ for any $q$. For "only if", assume $\tau^*(q) = 1$, and consider a fractional vertex cover for which $\sum_{i=1}^{k} v_i = 1$. We prove that there exists a variable $x$ that occurs in all atoms. If not, then every variable $x_j$ is missing from at least one atom $S$: since $\sum_{i : x_i \in vars(S)} v_i \geq 1$, it follows that $v_j = 0$, for all variables $x_j$, which is a contradiction.

Thus, a query can be computed in one round on MPC(0) if and only if it has a variable occurring in all atoms. The corollary should be contrasted with the results in [17], which proved that a query is computable in one round iff it is tall-flat. Any connected tall-flat query has a variable occurring in all atoms, but the converse is not true in general. The algorithm in [17] works for any input data, including skewed inputs, while here we restrict to matching databases. For example, $S_1(x, y), S_2(x, y), S_3(x, z)$ can be computed in one round if all inputs are permutations, but it is not tall-flat, and hence it cannot be computed in one round on general input data.

Theorem 3.3 tells us that a query $q$ can report at most a $1/p^{\tau^*(q)(1-\varepsilon)-1}$ fraction of answers.
We show that there is an algorithm achieving this for matching databases:

Proposition. Given a connected query $q$ and $\varepsilon < 1 - 1/\tau^*(q)$, there exists an algorithm that reports $\Theta(E[|q(I)|]/p^{\tau^*(q)(1-\varepsilon)-1})$ answers in expectation on any matching database in one round in MPC($\varepsilon$).

Proof. The algorithm we describe here is similar to the HC algorithm described in Section 3.1. For each variable $x_i$, define the shares $p_i = p^{(1-\varepsilon)v_i}$, where $v = (v_1, \ldots, v_k)$ is the optimal fractional vertex cover. We use random hash functions $h_i : [n] \to [p_i]$ for each $i = 1, \ldots, k$. This creates $p^{(1-\varepsilon)\tau^*(q)}$ hashing buckets in the $k$-dimensional hypercube $[p_1] \times \cdots \times [p_k]$. Notice that, since $\varepsilon < 1 - 1/\tau^*(q)$, the number of points in the hypercube is strictly greater than $p$, so it is not possible to assign each point to one of the $p$ servers. Instead, we will pick $p$ points uniformly at random and assign each of the $p$ servers to one of the points in the hypercube. Since each potential output tuple is hashed to a random point of the hypercube, the probability that a potential output tuple is covered by one of the servers is $p/p^{(1-\varepsilon)\tau^*(q)}$. Our algorithm will execute exactly as the HC algorithm, but will communicate only tuples that are hashed to one of the chosen hypercube points (and thus to one of the servers). By our previous discussion, the algorithm reports in expectation $p^{1-(1-\varepsilon)\tau^*(q)}\, E[|q(I)|]$ tuples.

To conclude the proof, it suffices to show that each of the servers will receive $O(n/p^{1-\varepsilon})$ tuples. Indeed, using a similar analysis to the HC algorithm, each server receives from a relation $S_j$ in expectation $n/\prod_{i : x_i \in vars(S_j)} p_i$ tuples. Since $v$ is a fractional cover, we have that $\prod_{i : x_i \in vars(S_j)} p_i = p^{(1-\varepsilon)\sum_{i : x_i \in vars(S_j)} v_i} \geq p^{1-\varepsilon}$. Thus, the number of tuples received by each server is in expectation $O(n/p^{1-\varepsilon})$. Moreover, following the same argument as in the proof of Proposition 3.2, the probability that the number of tuples per server deviates more than a constant factor from the expectation is exponentially small in $n$.

Note that the algorithm is forced to run in one round, in an MPC($\varepsilon$) model strictly weaker than its space exponent, hence it cannot find all the answers: the proposition says that the algorithm can find an expected number of answers that matches Theorem 3.3.

So far, our lower bounds were for the JOIN-REPORTING problem. We can extend the lower bounds to the JOIN-WITNESS problem.

Proposition. For $\varepsilon < 1/2$, there exists no one-round MPC($\varepsilon$) algorithm that solves JOIN-WITNESS for the query $q(w, x, y, z) = R(w), S_1(w, x), S_2(x, y), S_3(y, z), T(z)$.

Proof. Consider the family of inputs where $S_1, S_2, S_3$ are 2-dimensional matchings, while $R, T$ are uniformly at random chosen subsets of $[n]$ of size $\sqrt{n}$. It is easy to see that for this input distribution $I$, $E[|q(I)|] = 1$, since the probability that each answer of the subquery $q' = S_1, S_2, S_3$ is included in the final output is $(1/\sqrt{n}) \cdot (1/\sqrt{n}) = 1/n$, while $q'$ has exactly $n$ answers. Since the size of $R, T$ is $\sqrt{n}$ and $n \gg p$, we can assume w.l.o.g. that both relations are broadcast to all servers during the first round. Further, we can assume w.l.o.g. as before that $S_1, S_2, S_3$ are initially stored in three separate input servers. Fix some server $s$; by Theorem 3.3, in expectation the server contains only $O(E[|q'(I)|]/p^{2(1-\varepsilon)})$ tuples from the subquery $q'$. Since $R, T$ are known to the server, and $E[|q'(I)|] = n$, the server will know in expectation $(1/\sqrt{n}) \cdot (1/\sqrt{n}) \cdot O(n/p^{2(1-\varepsilon)}) = O(1/p^{2(1-\varepsilon)})$ tuples from $q$. Consequently, the servers know in total $O(1/p^{2(1-\varepsilon)-1})$ output tuples in expectation.
Since $\varepsilon < 1/2$, and the input has in expectation one answer, we can argue as in Corollary 3.5 that the probability that any algorithm will provide the tuple as a witness is polynomially small.

4 Multiple Communication Steps

In this section we consider a restricted version of the MPC($\varepsilon$) model, called the tuple-based MPC($\varepsilon$) model, which can simulate multi-round MapReduce for database queries. We will establish both upper and lower bounds on the number of rounds needed to compute any connected query $q$ in this tuple-based MPC($\varepsilon$) model.

An Algorithm for Multiple Rounds

Given an $\varepsilon \geq 0$, let $\Gamma^1_\varepsilon$ denote the class of connected queries $q$ for which $\tau^*(q) \leq 1/(1-\varepsilon)$; these are precisely the queries that can be computed in one round in the MPC($\varepsilon$) model on matching databases. We extend this definition inductively to larger numbers of rounds: given $\Gamma^r_\varepsilon$ for some $r \geq 1$, define $\Gamma^{r+1}_\varepsilon$ to be the set of all connected queries $q$ constructed as follows. Let $q_1, \ldots, q_m \in \Gamma^r_\varepsilon$ be $m$ queries, and let $q_0 \in \Gamma^1_\varepsilon$ be a query over a different vocabulary $V_1, \ldots, V_m$, such that $|vars(q_j)| = arity(V_j)$ for all $j \in [m]$. Then, the query $q = q_0[q_1/V_1, \ldots, q_m/V_m]$, obtained by substituting each view $V_j$ in $q_0$ with its definition $q_j$, is in $\Gamma^{r+1}_\varepsilon$. In other words, $\Gamma^r_\varepsilon$ consists of queries that have a query plan of depth $r$, where each operator is a query computable in one step.
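To make the inductive definition concrete, the sketch below computes a plan depth for a path query with $k$ binary atoms, $S_1(x_1, x_2), \ldots, S_k(x_k, x_{k+1})$. It rests on assumptions that are not stated in the text above: that such a path has $\tau^* = \lceil k/2 \rceil$ (a standard fact for paths, which are bipartite), and that a path-shaped query over views has the same covering number as a path over binary atoms. The function names are hypothetical, and the result is the depth of one particular plan, i.e. an upper bound on the smallest $r$ with the path in $\Gamma^r_\varepsilon$.

```python
from fractions import Fraction
from math import floor


def max_one_round_path_length(eps):
    """Largest k with ceil(k/2) <= 1/(1 - eps), i.e. 2 * floor(1/(1 - eps)).

    Assumes 0 <= eps < 1; pass a Fraction to avoid rounding problems.
    """
    return 2 * floor(1 / (1 - eps))


def plan_depth_for_path(k, eps):
    """Depth of a bottom-up plan for a path query with k atoms.

    Each round groups up to k_eps consecutive sub-paths into one view,
    so the number of remaining views shrinks by a factor k_eps per round.
    """
    k_eps = max_one_round_path_length(eps)
    depth, length = 1, k
    while length > k_eps:
        length = -(-length // k_eps)  # ceiling division
        depth += 1
    return depth
```

For instance, `plan_depth_for_path(16, Fraction(0))` walks 16, 8, 4, 2 and returns 4, while `plan_depth_for_path(16, Fraction(1, 2))` uses groups of four and returns 2, which illustrates the tradeoff between the number of rounds and the space exponent.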
