Tight Bounds for Distributed Functional Monitoring


David P. Woodruff, IBM Almaden
Qin Zhang, IBM Almaden

Abstract

We resolve several fundamental questions in the area of distributed functional monitoring, initiated by Cormode, Muthukrishnan, and Yi (SODA, 2008), and receiving recent attention. In this model there are $k$ sites each tracking their input streams and communicating with a central coordinator. The coordinator's task is to continuously maintain an approximate output to a function computed over the union of the $k$ streams. The goal is to minimize the number of bits communicated.

Let the $p$-th frequency moment be defined as $F_p = \sum_i f_i^p$, where $f_i$ is the frequency of element $i$. We show the randomized communication complexity of estimating the number of distinct elements (that is, $F_0$) up to a $1 + \varepsilon$ factor is $\Omega(k/\varepsilon^2)$, improving upon the previous $\Omega(k + 1/\varepsilon^2)$ bound and matching known upper bounds up to a logarithmic factor. For $F_p$, $p > 1$, we improve the previous $\Omega(k + 1/\varepsilon^2)$ bits communication bound to $\Omega(k^{p-1}/\varepsilon^2)$. We obtain similar improvements for heavy hitters, empirical entropy, and other problems. Our lower bounds are the first of any kind in distributed functional monitoring to depend on the product of $k$ and $1/\varepsilon^2$. Moreover, the lower bounds are for the static version of the distributed functional monitoring model, where the coordinator only needs to compute the function at the time when all $k$ input streams end; surprisingly, they almost match what is achievable in the (dynamic version of the) distributed functional monitoring model, where the coordinator needs to keep track of the function continuously at any time step.

We also show that we can estimate $F_p$, for any $p > 1$, using $\tilde{O}(k^{p-1}\,\mathrm{poly}(\varepsilon^{-1}))$ bits of communication. This drastically improves upon the previous $\tilde{O}(k^{2p+1} N^{1-2/p}\,\mathrm{poly}(\varepsilon^{-1}))$ bits bound of Cormode, Muthukrishnan, and Yi for general $p$, and their $\tilde{O}(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$ bits bound for $p = 2$. For $p = 2$, our bound resolves their main open question.
Our lower bounds are based on new direct sum theorems for approximate majority, and yield improvements to classical problems in the standard data stream model. First, we improve the known lower bound for estimating $F_p$, $p > 2$, in $t$ passes from $\Omega(n^{1-2/p}/(\varepsilon^{2/p} t))$ to $\Omega(n^{1-2/p}/(\varepsilon^{4/p} t))$, giving the first bound that matches what we expect when $p = 2$ for any constant number of passes. Second, we give the first lower bound for estimating $F_0$ in $t$ passes with $\Omega(1/(\varepsilon^2 t))$ bits of space that does not use the hardness of the gap-hamming problem.

Author note: Most of this work was done while Qin Zhang was a postdoc in MADALGO (Center for Massive Data Algorithmics - a Center of the Danish National Research Foundation), Aarhus University.

1 Introduction

Recent applications in sensor networks and distributed systems have motivated the distributed functional monitoring model, initiated by Cormode, Muthukrishnan, and Yi [20]. In this model there are $k$ sites and a single central coordinator. Each site $S_i$ ($i \in [k]$) receives a stream of data $A_i(t)$ for timesteps $t = 1, 2, \ldots$, and the coordinator wants to keep track of a function $f$ that is defined over the multiset union of the $k$ data streams at each time $t$. For example, the function $f$ could be the number of distinct elements in the union

of the $k$ streams. We assume that there is a two-way communication channel between each site and the coordinator so that the sites can communicate with the coordinator. The goal is to minimize the total amount of communication between the sites and the coordinator so that the coordinator can approximately maintain $f(A_1(t), \ldots, A_k(t))$ at any time $t$. Minimizing the total communication is motivated by power constraints in sensor networks, since communication typically uses a power-hungry radio [25], and also by network bandwidth constraints in distributed systems. There is a large body of work on monitoring problems in this model, including maintaining a random sample [21, 50], estimating frequency moments [18, 20], finding the heavy hitters [5, 42, 45, 54], approximating the quantiles [19, 35, 54], and estimating the entropy [4].

We can think of the distributed functional monitoring model as follows. Each of the $k$ sites holds an $N$-dimensional vector, where $N$ is the size of the universe. An update to a coordinate $j$ on site $S_i$ causes $v^i_j$ to increase by 1. The goal is to estimate a statistic of $v = \sum_{i=1}^{k} v^i$, such as the $p$-th frequency moment $F_p = \|v\|_p^p$, the number of distinct elements $F_0 = |\mathrm{support}(v)|$, and the empirical entropy $H = \sum_i \frac{v_i}{\|v\|_1} \log \frac{\|v\|_1}{v_i}$. This is the standard insertion-only model. For many of these problems, with the exception of the empirical entropy, there are strong lower bounds (e.g., $\Omega(N)$) if we allow updates to coordinates that cause $v^i_j$ to decrease [4]. The latter is called the update model. Thus, except for entropy, we follow previous work and consider the insertion-only model.

To prove lower bounds, we consider the static version of the distributed functional monitoring model, where the coordinator only needs to compute the function at the time when all $k$ input streams end. It is clear that a lower bound for the static case is also a lower bound for the dynamic case, in which the coordinator has to keep track of the function at any point in time.
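The statistics defined above can be made concrete with a small centralized computation. The following is a minimal sketch, not part of the paper: it assumes each site's stream is given as a Python list of element identifiers, and the helper names (`union_vector`, `frequency_moment`, `empirical_entropy`) are hypothetical. It computes the exact values that a monitoring protocol would only approximate.

```python
import math
from collections import Counter

def union_vector(site_streams):
    """Sum the per-site frequency vectors: v = sum_i v^i (sparse form)."""
    v = Counter()
    for stream in site_streams:
        v.update(stream)  # each stream item is one insertion-only update
    return v

def frequency_moment(v, p):
    """F_p = sum_j v_j^p over nonzero coordinates; F_0 = |support(v)|."""
    if p == 0:
        return sum(1 for c in v.values() if c > 0)
    return sum(c ** p for c in v.values())

def empirical_entropy(v):
    """H = sum_j (v_j / |v|_1) * log2(|v|_1 / v_j)."""
    total = sum(v.values())
    return sum((c / total) * math.log2(total / c) for c in v.values() if c > 0)

# Three sites (k = 3), each holding a short stream over the universe {1, 2, 3, 4}.
streams = [[1, 2, 2], [2, 3], [1, 1, 4]]
v = union_vector(streams)
print(frequency_moment(v, 0), frequency_moment(v, 2), empirical_entropy(v))
```

In the static model studied for the lower bounds, the coordinator would need a $(1+\varepsilon)$-approximation of one of these values once, after all streams end, without the sites shipping their whole vectors.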
The static version of the distributed functional monitoring model is closely related to the multiparty number-in-hand communication model, where we again have $k$ sites each holding an $N$-dimensional vector $v^i$, and they want to jointly compute a function defined on the $k$ input vectors. It is easy to see that these two models are essentially the same since in the former, if site $S_i$ would like to send a message to $S_j$, it can always send the message first to the coordinator, and then the coordinator can forward the message to $S_j$. Doing this will only increase the total amount of communication by a factor of two. Therefore, we do not distinguish between these two models in this paper.

There are two variants of the multiparty number-in-hand communication model we will consider: the blackboard model, in which each message a site sends is received by all other sites, i.e., it is broadcast, and the message-passing model, in which each message is between the coordinator and a specific site.

Despite the large body of work in the distributed functional monitoring model, the complexity of basic problems is not well understood. For example, for estimating $F_0$ up to a $(1 + \varepsilon)$-factor, the best upper bound is $\tilde{O}(k/\varepsilon^2)$ (footnote 1) [20] (all communication and information bounds in this paper, if not otherwise stated, are in terms of bits), while the only known lower bound is $\Omega(k + 1/\varepsilon^2)$. The dependence on $\varepsilon$ in the lower bound is not very insightful, as the $\Omega(1/\varepsilon^2)$ bound follows just by considering two sites [4, 16]. The real question is whether the $k$ and $1/\varepsilon^2$ factors should multiply. Even more embarrassingly, for the frequency moments $F_p$, $p > 2$, the known algorithms use communication $\tilde{O}(k^{2p+1} N^{1-2/p}\,\mathrm{poly}(1/\varepsilon))$, while the only known lower bound is $\Omega(k + 1/\varepsilon^2)$ [4, 16]. Even for $p = 2$, the best known upper bound is $\tilde{O}(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$ [20], and the authors' main open question in their paper is "It remains to close the gap in the $F_2$ case: can a better lower bound than $\Omega(k)$ be shown, or do there exist $\tilde{O}(k \cdot \mathrm{poly}(1/\varepsilon))$ solutions?"
Footnote 1: We use $\tilde{O}(f)$ to denote a function of the form $f \cdot \log^{O(1)}(Nk/\varepsilon)$.

Our Results: We significantly improve the previous communication bounds for approximating the frequency moments, entropy, heavy hitters, and quantiles in the distributed functional monitoring model. In many cases our bounds are optimal. Our results are summarized in Table 1, where they are compared with previous bounds. We have three main results, each introducing a new technique:

Problem | Previous work LB | This paper LB (all static) | Previous work UB | This paper UB
$F_0$ | $\Omega(k)$ [20] | $\Omega(k/\varepsilon^2)$ | $\tilde{O}(k/\varepsilon^2)$ [20] | -
$F_2$ | $\Omega(k)$ [20] | $\Omega(k/\varepsilon^2)$ (BB) | $\tilde{O}(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$ [20] | $\tilde{O}(k/\mathrm{poly}(\varepsilon))$
$F_p$ ($p > 1$) | $\Omega(k + 1/\varepsilon^2)$ [4, 16] | $\Omega(k^{p-1}/\varepsilon^2)$ (BB) | $\tilde{O}(k^{2p+1} N^{1-2/p}/\varepsilon^{1+2/p})$ [20] | $\tilde{O}(k^{p-1}/\mathrm{poly}(\varepsilon))$
All-quantile | $\Omega(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ [35] | $\Omega(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ (BB) | $\tilde{O}(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ [35] | -
Heavy Hitters | $\Omega(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ [35] | $\Omega(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ (BB) | $\tilde{O}(\min\{\sqrt{k}/\varepsilon, 1/\varepsilon^2\})$ [35] | -
Entropy | $\Omega(1/\sqrt{\varepsilon})$ [4] | $\Omega(k/\varepsilon^2)$ (BB) | $\tilde{O}(k/\varepsilon^3)$ [4], $\tilde{O}(k/\varepsilon^2)$ (static) [33] | -
$\ell_p$ ($p \in (0, 2]$) | - | $\Omega(k/\varepsilon^2)$ (BB) | $\tilde{O}(k/\varepsilon^2)$ (static) [40] | -

Table 1: UB denotes upper bound; LB denotes lower bound; BB denotes blackboard model. $N$ denotes the universe size. All bounds are for randomized algorithms. We assume all bounds hold in the dynamic setting by default, and will state explicitly if they hold in the static setting. For lower bounds we assume the message-passing model by default, and state explicitly if they also hold in the blackboard model.

1. We show that for estimating $F_0$ in the message-passing model, $\Omega(k/\varepsilon^2)$ communication is required, matching an upper bound of [20] up to a polylogarithmic factor. Our lower bound holds in the static model, in which the $k$ sites just need to approximate $F_0$ once on their inputs.

2. We show that we can estimate $F_p$, for any $p > 1$, using $\tilde{O}(k^{p-1}\,\mathrm{poly}(\varepsilon^{-1}))$ communication in the message-passing model (footnote 2). This drastically improves upon the previous bound $\tilde{O}(k^{2p+1} N^{1-2/p}\,\mathrm{poly}(\varepsilon^{-1}))$ of [20]. In particular, setting $p = 2$, we resolve the main open question of [20].

3. We show $\Omega(k^{p-1}/\varepsilon^2)$ communication is necessary for approximating $F_p$ ($p > 1$) in the blackboard model, significantly improving the prior $\Omega(k + 1/\varepsilon^2)$ bound. As with our lower bound for $F_0$, these are the first lower bounds which depend on the product of $k$ and $1/\varepsilon$. As with $F_0$, our lower bound holds in the static model, in which the sites just approximate $F_p$ once.

Our other results in Table 1 are explained in the body of the paper, and use similar techniques.
We would like to mention that after the conference version of our paper, our results found applications in proving a space lower bound at each site for tracking heavy hitters in the functional monitoring model [36], and a communication complexity lower bound for computing $\varepsilon$-approximations of range spaces in $\mathbb{R}^2$ in the message-passing model [34].

Footnote 2: We assume the total number of updates is $\mathrm{poly}(N)$.

Footnote 3: In the conference version of this paper we introduced a problem called $k$-GAP-MAJ, in which sites need to decide if at least $1/(2\varepsilon^2) + 1/\varepsilon$ of the bits are 1, or at most $1/(2\varepsilon^2) - 1/\varepsilon$ of the bits are 1. We instead use $k$-APPROX-SUM here since we feel it is easier to work with: this problem is stronger than $k$-GAP-MAJ, and thus is easier to lower bound, and it suffices for our purpose. $k$-GAP-MAJ will be introduced and used in Section 6.1 for heavy hitters and quantiles.

Our Techniques:

Lower Bound for $F_0$: For illustration, suppose $k = 1/\varepsilon^2$. There are $1/\varepsilon^2$ sites each holding a random independent bit. Their task is to approximate the sum of the $k$ bits up to an additive error $1/\varepsilon$. Call this problem $k$-APPROX-SUM (footnote 3). We show any correct protocol must reveal $\Omega(1/\varepsilon^2)$ bits of information about the sites' inputs. We compose this with 2-party disjointness (2-DISJ) [48], in which each party has a bitstring of length $1/\varepsilon^2$ and either the strings have disjoint support (the solution is 0) or there is a single coordinate which is 1 in both strings (the solution is 1). Let $\tau$ be the hard distribution for 2-DISJ, shown to require $\Omega(1/\varepsilon^2)$ bits of communication to solve [48]. Suppose the coordinator and each site share an instance of 2-DISJ in which the solution to 2-DISJ is a random bit, which is the site's effective input to $k$-APPROX-SUM. The coordinator has the same input for each of the $1/\varepsilon^2$ instances,

while the sites have an independent input drawn from $\tau$ conditioned on the coordinator's input and output bit determined by $k$-APPROX-SUM. The inputs are chosen so that if the output of 2-DISJ is 1, then $F_0$ increases by 1; otherwise it remains the same. This is not entirely accurate, but it illustrates the main idea. Now, the key is that by the rectangle property of $k$-party communication protocols, the $1/\varepsilon^2$ different output bits are independent conditioned on the transcript. Thus if a protocol does not reveal $\Omega(1/\varepsilon^2)$ bits of information about these output bits, by an anti-concentration theorem we can show that the protocol cannot succeed with large probability. Finally, since a $(1 + \varepsilon)$-approximation to $F_0$ can decide $k$-APPROX-SUM, and since any correct protocol for $k$-APPROX-SUM must reveal $\Omega(1/\varepsilon^2)$ bits of information, the protocol must solve $\Omega(1/\varepsilon^2)$ instances of 2-DISJ, each requiring $\Omega(1/\varepsilon^2)$ bits of communication (otherwise the coordinator could simulate $k - 1$ of the sites and obtain an $o(1/\varepsilon^2)$-communication protocol for 2-DISJ with the remaining site, contradicting the communication lower bound for 2-DISJ on this distribution). We obtain an $\Omega(k/\varepsilon^2)$ bound for $k \ge 1/\varepsilon^2$ by using similar arguments. One cannot show this in the blackboard model since there is an $\tilde{O}(k + 1/\varepsilon^2)$ bound for $F_0$ (footnote 4).

Lower Bound for $F_p$: Our $\Omega(k^{p-1}/\varepsilon^2)$ bound for $F_p$ cannot use the above reduction since we do not know how to turn a protocol for approximating $F_p$ into a protocol for solving the composition of $k$-APPROX-SUM and 2-DISJ. Instead, our starting point is a recent $\Omega(1/\varepsilon^2)$ lower bound for the 2-party gap-hamming distance problem GHD [16]. The parties have a length-$1/\varepsilon^2$ bitstring, $x$ and $y$, respectively, and they must decide if the Hamming distance $\Delta(x, y) > 1/(2\varepsilon^2) + 1/\varepsilon$ or $\Delta(x, y) < 1/(2\varepsilon^2) - 1/\varepsilon$. A simplification by Sherstov [49] shows a related problem called 2-GAP-ORT also has communication complexity of $\Omega(1/\varepsilon^2)$ bits.
Here there are two parties, each with $1/\varepsilon^2$-length bitstrings $x$ and $y$, and they must decide if $|\Delta(x, y) - 1/(2\varepsilon^2)| > 2/\varepsilon$ or $|\Delta(x, y) - 1/(2\varepsilon^2)| < 1/\varepsilon$. Chakrabarti et al. [15] showed that any correct protocol for 2-GAP-ORT must reveal $\Omega(1/\varepsilon^2)$ bits of information about $(x, y)$. By independence and the chain rule, this means for $\Omega(1/\varepsilon^2)$ indices $i$, $\Omega(1)$ bits of information is revealed about $(x_i, y_i)$ conditioned on values $(x_j, y_j)$ for $j < i$.

We now embed an independent copy of a variant of $k$-party disjointness, the $k$-XOR problem, on each of the $1/\varepsilon^2$ coordinates of 2-GAP-ORT. In this variant, there are $k$ parties each holding a bitstring of length $k^p$. On all but one special randomly chosen coordinate, there is a single site assigned to the coordinate and that site uses private randomness to choose whether the value on the coordinate is 0 or 1 (with equal probability), and the remaining $k - 1$ sites have 0 on this coordinate. On the special coordinate, with probability 1/4 all sites have a 0 on this coordinate (a "00" instance), with probability 1/4 the first $k/2$ parties have a 1 on this coordinate and the remaining $k/2$ parties have a 0 (a "10" instance), with probability 1/4 the second $k/2$ parties have a 1 on this coordinate and the remaining $k/2$ parties have a 0 (a "01" instance), and with the remaining probability 1/4 all $k$ parties have a 1 on this coordinate (a "11" instance). We show, via a direct sum for distributional communication complexity, that any deterministic protocol that decides which case the special coordinate is in with probability $1/4 + \Omega(1)$ has conditional information cost $\Omega(k^{p-1})$. This implies that any protocol that can decide whether the output is in the set $\{10, 01\}$ (the XOR of the output bits) with probability $1/2 + \Omega(1)$ has conditional information cost $\Omega(k^{p-1})$.
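To make the $k$-XOR input distribution described above easier to picture, here is a minimal sampling sketch. It is an illustration under assumptions rather than the paper's construction: the function name `sample_k_xor` is hypothetical, the string length is passed as a parameter `length` (the text uses $k^p$), and the site assigned to each non-special coordinate is assumed to be chosen uniformly at random, which the text does not specify.

```python
import random

def sample_k_xor(k, length, rng):
    """Sample one k-XOR instance.

    Returns (inputs, special, case): inputs[i][j] is site i's bit on
    coordinate j, special is the special coordinate, and case is one of
    '00', '10', '01', '11'.
    """
    assert k % 2 == 0, "the description splits the sites into two halves"
    inputs = [[0] * length for _ in range(k)]
    special = rng.randrange(length)
    for j in range(length):
        if j == special:
            continue
        owner = rng.randrange(k)              # assumption: uniform owner
        inputs[owner][j] = rng.randint(0, 1)  # owner's private fair coin
    case = rng.choice(["00", "10", "01", "11"])  # each with probability 1/4
    for i in range(k):
        half_bit = case[0] if i < k // 2 else case[1]
        inputs[i][special] = int(half_bit)
    return inputs, special, case

rng = random.Random(7)
inputs, special, case = sample_k_xor(k=4, length=16, rng=rng)
print(special, case)
```

On every non-special coordinate at most one site holds a 1, while on the special coordinate each half of the sites agrees on a common bit; the protocol's task is to identify which of the four cases occurred.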
We do the direct sum argument by conditioning the mutual information on low-entropy random variables which allow us to fill in inputs on remaining coordinates without any communication between the parties and without asymptotically affecting our $\Omega(k^{p-1})$ lower bound. We design a reduction so that on the $i$-th coordinate of 2-GAP-ORT, the input of the first $k/2$ players of $k$-XOR is determined by the public coin (which we condition on) and the first party's input bit to 2-GAP-ORT, and the input of the second $k/2$ players of $k$-XOR is determined by the public coin and the second party's input bit to 2-GAP-ORT. We show that any protocol that solves the composition of 2-GAP-ORT with $1/\varepsilon^2$ copies of $k$-XOR, a problem that we call $k$-BTX, must reveal $\Omega(1)$ bits of information about the two output bits of an $\Omega(1)$ fraction of the $1/\varepsilon^2$ copies, and from our $\Omega(k^{p-1})$ information cost lower bound for a single copy, we can obtain an overall $\Omega(k^{p-1}/\varepsilon^2)$ bound. Finally, one can show that a $(1 + \varepsilon)$-approximation algorithm for $F_p$ can be used to solve $k$-BTX.

Footnote 4: The idea is to first obtain a 2-approximation. Then, sub-sample so that there are $\Theta(1/\varepsilon^2)$ distinct elements. Then the first party broadcasts his distinct elements, the second party broadcasts the distinct elements he has that the first party does not, etc.

Upper Bound for $F_p$: We illustrate the algorithm for $p = 2$ and constant $\varepsilon$. Unlike [20], we do not use AMS sketches [3]. A nice property of our protocol is that it is the first 1-way protocol (the protocol of [20] is not), in the sense that only the sites send messages to the coordinator (the coordinator does not send any messages). Moreover, all messages are simple: if a site receives an update to the $j$-th coordinate, provided the frequency of coordinate $j$ in its stream exceeds a threshold, it decides with a certain probability to send $j$ to the coordinator. Unfortunately, one can show that this probability cannot be the same for all coordinates $j$, as otherwise the communication would be too large.

To determine the threshold and probability to send an update to a coordinate $j$, the sites use the public coin to randomly group all coordinates $j$ into buckets $S_\ell$, where $S_\ell$ contains a $1/2^\ell$ fraction of the input coordinates. For $j \in S_\ell$, the threshold and probability are only a function of $\ell$. Inspired by work on subsampling [37], we try to estimate the number of coordinates $j$ of magnitude in the range $[2^h, 2^{h+1})$, for each $h$. Call this class of coordinates $C_h$. If the contribution to $F_2$ from $C_h$ is significant, then $|C_h| \approx 2^{-2h} F_2$, and to estimate $|C_h|$ we only consider those $j \in C_h$ that are in $S_\ell$ for a value $\ell$ which satisfies $|C_h| \cdot 2^{-\ell} \approx 2^{-2h} F_2 \cdot 2^{-\ell} \approx 1$. We do not know $F_2$ and so we also do not know $\ell$, but we can make a logarithmic number of guesses.
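The magnitude classes $C_h$ used above can be illustrated with a small centralized computation. This is only a sketch of the classification step, not the distributed protocol: it assumes the full vector $v$ is available as a Python dict with nonnegative integer frequencies, and the helper name `class_contributions` is hypothetical.

```python
from collections import defaultdict

def class_contributions(v):
    """Group coordinates into classes C_h = {j : 2^h <= v_j < 2^(h+1)}.

    Returns {h: (|C_h|, sum of v_j^2 over j in C_h)}.
    """
    classes = defaultdict(lambda: [0, 0])
    for value in v.values():
        if value <= 0:
            continue
        h = value.bit_length() - 1  # floor(log2(value)) for positive ints
        classes[h][0] += 1
        classes[h][1] += value ** 2
    return {h: tuple(stats) for h, stats in classes.items()}

v = {"a": 1, "b": 1, "c": 3, "d": 8, "e": 9}
contrib = class_contributions(v)
f2 = sum(total for _, total in contrib.values())
for h, (size, total) in sorted(contrib.items()):
    # Each class contributes between |C_h| * 2^(2h) and |C_h| * 2^(2h+2) to F_2.
    print(h, size, total, size * 4 ** h <= total < size * 4 ** (h + 1))
print("F_2 =", f2)
```

Because every coordinate in $C_h$ has magnitude within a factor of two of $2^h$, an accurate estimate of $|C_h|$ for each significant class yields a constant-factor estimate of that class's contribution to $F_2$, which is why the protocol concentrates on estimating the class sizes.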
We note that the work [37] was available to the authors of [20] for several years, but adapting it to the distributed framework here is tricky in the sense that the heavy hitters algorithm used in [37] for finding elements in different $C_h$ needs to be implemented in a $k$-party communication-efficient way. When choosing the threshold and probability we have two competing constraints; on the one hand these values must be chosen so that we can accurately estimate the values $|C_h|$ from the samples. On the other hand, these values need to be chosen so that the communication is not excessive. Balancing these two constraints forces us to use a threshold instead of just the same probability for all coordinates in $S_\ell$. By choosing the thresholds and probabilities to be appropriate functions of $\ell$, we can satisfy both constraints. Other minor issues in the analysis arise from the fact that different classes contribute at different times, and that the coordinator must be correct at all times. These issues can be resolved by conditioning on a quantity related to the protocol's correctness being accurate at a small number of selected times in the stream, and then arguing that the quantity is non-decreasing and that this implies that it is correct at all times.

Implications for the Data Stream Model: In 2003, Indyk and Woodruff introduced the GHD problem [38], where a 1-round lower bound shortly followed [52]. Ever since, it seemed the space complexity of estimating $F_0$ in a data stream with $t > 1$ passes hinged on whether GHD required $\Omega(1/\varepsilon^2)$ communication for $t$ rounds; see, e.g., Question 10 in [2]. A flurry [9, 10, 16, 49, 51] of recent work finally resolved the complexity of GHD. What our lower bound shows for $F_0$ is that this is not the only way to prove the $\Omega(1/\varepsilon^2)$ space bound for multiple passes for $F_0$. Indeed, we just needed to look at $\Theta(1/\varepsilon^2)$ parties instead of 2 parties.
Since we have an $\Omega(1/\varepsilon^4)$ communication lower bound for $F_0$ with $\Theta(1/\varepsilon^2)$ parties, this implies an $\Omega((1/\varepsilon^4)/(t/\varepsilon^2)) = \Omega(1/(t\varepsilon^2))$ bound for $t$-pass algorithms for approximating $F_0$. Arguably our proof is simpler than the recent GHD lower bounds.

Our $\Omega(k^{p-1}/\varepsilon^2)$ bound for $F_p$ also improves a long line of work on the space complexity of estimating $F_p$ for $p > 2$ in a data stream. The current best upper bound is $\tilde{O}(N^{1-2/p}\varepsilon^{-2})$ bits of space [28]. See Figure 1 of [28] for a list of papers which make progress on the $\varepsilon$ and logarithmic factors. The previous best lower bound is $\Omega(N^{1-2/p}\varepsilon^{-2/p}/t)$ for $t$ passes [7]. By setting $k^p = \varepsilon^2 N$, we obtain that the total communication is at least $\Omega(\varepsilon^{2-2/p} N^{1-1/p}/\varepsilon^2)$, and so the implied space lower bound for $t$-pass algorithms for $F_p$ in a

data stream is $\Omega(\varepsilon^{-2/p} N^{1-1/p}/(tk)) = \Omega(N^{1-2/p}/(\varepsilon^{4/p} t))$. This gives the first bound that agrees with the tight $\Theta(1/\varepsilon^2)$ bound when $p = 2$ for any constant $t$. After our work, Ganguly [29] improved this for the special case $t = 1$. That is, for 1-pass algorithms for estimating $F_p$, $p > 2$, he shows a space lower bound of $\Omega(N^{1-2/p}/(\varepsilon^2 \log n))$.

Other Related Work: There are quite a few papers on multiparty number-in-hand communication complexity, though they are not directly relevant for the problems studied in this paper. Alon et al. [3] and Bar-Yossef et al. [7] studied lower bounds for multiparty set-disjointness, which has applications to $p$-th frequency moment estimation for $p > 2$ in the streaming model. Their results were further improved in [14, 31, 39]. Chakrabarti et al. [12] studied random-partition communication lower bounds for multiparty set-disjointness and pointer jumping, which have a number of applications in the random-order data stream model. Other work includes Chakrabarti et al. [13] for median selection, and Magniez et al. [44] and Chakrabarti et al. [11] for streaming language recognition. Very few studies have been conducted in the message-passing model. Duris and Rolim [23] proved several lower bounds in the message-passing model, but only for some simple Boolean functions. Three related but more restrictive private-message models were studied by Gal and Gopalan [27], Ergün and Jowhari [24], and Guha and Huang [32]. The first two only investigated deterministic protocols and the third was tailored for the random-order data stream model.

Recently Phillips et al. [47] introduced a technique called symmetrization for the number-in-hand communication model. The idea is to try to find a symmetric hard distribution for the $k$ players. Then one reduces the $k$-player problem to a 2-player problem by assigning Alice the input of a random player and Bob the inputs of the remaining $k - 1$ players. The answer to the $k$-player problem gives the answer to the 2-player problem.
By symmetrization one can argue that if the communication lower bound for the resulting 2-player problem is $L$, then the lower bound for the $k$-player problem is $\Omega(kL)$. While symmetrization developed in [47] can be used to solve some problems for which other techniques are not known, such as bitwise AND/OR and graph connectivity, it has several limitations. First, symmetrization requires a symmetric hard distribution, and for many problems (e.g., $F_p$ ($p > 1$) in this paper) this is not known or unlikely to exist. Second, for many problems (e.g., $F_0$ in this paper), we need a direct-sum type of argument with certain combining functions (e.g., the majority (MAJ)), while in [47], only outputting all copies or the combining function OR is considered. Third, the symmetrization technique in [47] does not give information cost bounds, and so it is difficult to use when composing problems as is done in this paper. In this paper, we have further developed symmetrization to make it work with the combining function MAJ and the information cost.

Paper Outline: In Section 3 and Section 4 we prove our lower bounds for $F_0$ and $F_p$, $p > 1$. The lower bounds apply to functional monitoring, but hold even in the static model. In Section 5 we show improved upper bounds for $F_p$, $p > 1$, for functional monitoring. Finally, in Section 6 we prove lower bounds for all-quantile, heavy hitters, entropy and $\ell_p$ for any $p \ge 1$ in the blackboard model.

2 Preliminaries

In this section we review some basics on communication complexity and information theory.

Information Theory

We refer the reader to [22] for a comprehensive introduction to information theory. Here we review a few concepts and notations. Let $H(X)$ denote the Shannon entropy of the random variable $X$, and let $H_b(p)$ denote the binary entropy function when $p \in [0, 1]$. Let $H(X \mid Y)$ denote the conditional entropy of $X$ given $Y$. Let $I(X; Y)$ denote the mutual information between two random variables $X, Y$. Let $I(X; Y \mid Z)$ denote the mutual

information between two random variables $X, Y$ conditioned on $Z$. The following is a summary of the basic properties of entropy and mutual information that we need.

Proposition 1 Let $X, Y, Z, W$ be random variables.

1. If $X$ takes value in $\{1, 2, \ldots, m\}$, then $H(X) \in [0, \log m]$.

2. $H(X) \ge H(X \mid Y)$ and $I(X; Y) = H(X) - H(X \mid Y) \ge 0$.

3. If $X$ and $Z$ are independent, then we have $I(X; Y \mid Z) \ge I(X; Y)$. Similarly, if $X, Z$ are independent given $W$, then $I(X; Y \mid Z, W) \ge I(X; Y \mid W)$.

4. (Chain rule of mutual information) $I(X, Y; Z) = I(X; Z) + I(Y; Z \mid X)$. And in general, for any random variables $X_1, X_2, \ldots, X_n, Y$, $I(X_1, \ldots, X_n; Y) = \sum_{i=1}^{n} I(X_i; Y \mid X_1, \ldots, X_{i-1})$. Thus, $I(X, Y; Z \mid W) \ge I(X; Z \mid W)$.

5. (Data processing inequality) If $X$ and $Z$ are conditionally independent given $Y$, then $I(X; Y \mid Z, W) \le I(X; Y \mid W)$.

6. (Fano's inequality) Let $X$ be a random variable chosen from domain $\mathcal{X}$ according to distribution $\mu_X$, and $Y$ be a random variable chosen from domain $\mathcal{Y}$ according to distribution $\mu_Y$. For any reconstruction function $g : \mathcal{Y} \to \mathcal{X}$ with error $\delta_g$, $H_b(\delta_g) + \delta_g \log(|\mathcal{X}| - 1) \ge H(X \mid Y)$.

7. (The Maximum Likelihood Estimation principle) With the notations as in Fano's inequality, if the (deterministic) reconstruction function is $g(y) = x$ for the $x$ that maximizes the conditional probability $\mu_X(x \mid Y = y)$, then $\delta_g \le 1 - \frac{1}{2^{H(X \mid Y)}}$. Call this $g$ the maximum likelihood function.

Communication complexity

In the two-party randomized communication complexity model (see e.g., [43]), we have two players Alice and Bob. Alice is given $x \in \mathcal{X}$ and Bob is given $y \in \mathcal{Y}$, and they want to jointly compute a function $f(x, y)$ by exchanging messages according to a protocol $\Pi$. Let $\Pi(x, y)$ denote the message transcript when Alice and Bob run protocol $\Pi$ on input pair $(x, y)$. We sometimes abuse notation by identifying the protocol and the corresponding random transcript, as long as there is no confusion. The communication complexity of a protocol is defined as the maximum number of bits exchanged among all pairs of inputs.
We say a protocol $\Pi$ computes $f$ with error probability $\delta$ ($0 \le \delta \le 1$) if there exists a function $g$ such that for all input pairs $(x, y)$, $\Pr[g(\Pi(x, y)) \ne f(x, y)] \le \delta$. The $\delta$-error randomized communication complexity, denoted by $R^{\delta}(f)$, is the cost of the minimum-communication randomized protocol that computes $f$ with error probability $\delta$. The $(\mu, \delta)$-distributional communication complexity of $f$, denoted by $D^{\delta}_{\mu}(f)$, is the cost of the minimum-communication deterministic protocol that gives the correct answer for $f$ on at least a $1 - \delta$ fraction of all input pairs, weighted by distribution $\mu$. Yao [53] showed that

Lemma 1 (Yao's Lemma) $R^{\delta}(f) \ge \max_{\mu} D^{\delta}_{\mu}(f)$.

Thus, one way to prove a lower bound for randomized protocols is to find a hard distribution $\mu$ and lower bound $D^{\delta}_{\mu}(f)$. This is called Yao's Minimax Principle.

We will use the notion of expected distributional communication complexity $\mathrm{ED}^{\delta}_{\mu}(f)$, which was introduced in [47] (where it was written as $\mathrm{E}[D^{\delta}_{\mu}(f)]$, with a bit of abuse of notation) and is defined to be the expected cost (rather than the worst case cost) of the deterministic protocol that gives the correct answer for $f$ on at least a $1 - \delta$ fraction of all inputs, where the expectation is taken over distribution $\mu$.

The definitions for two-party protocols can be easily extended to the multiparty setting, where we have $k$ players and the $i$-th player is given an input $x_i \in \mathcal{X}_i$. Again the $k$ players want to jointly compute a function $f(x_1, x_2, \ldots, x_k)$ by exchanging messages according to a protocol $\Pi$.

Information complexity

Information complexity was introduced in a series of papers including [7, 17]. We refer the reader to Bar-Yossef's Thesis [6]; see Chapter 6 for a detailed introduction. Here we briefly review the concepts of information cost and conditional information cost for $k$-player communication problems. All of them are defined in the blackboard number-in-hand model.

Let $\mu$ be an input distribution on $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_k$ and let $X$ be a random input chosen from $\mu$. Let $\Pi$ be a randomized protocol running on inputs in $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_k$. The information cost of $\Pi$ with respect to $\mu$ is $I(X; \Pi)$. The information complexity of a problem $f$ with respect to a distribution $\mu$ and error parameter $\delta$ ($0 \le \delta \le 1$), denoted $\mathrm{IC}^{\delta}_{\mu}(f)$, is the minimum information cost of a $\delta$-error protocol for $f$ with respect to $\mu$. We will work in the public coin model, in which all parties also share a common source of randomness.

We say a distribution $\lambda$ partitions $\mu$ if conditioned on $\lambda$, $\mu$ is a product distribution. Let $X$ be a random input chosen from $\mu$ and $D$ be a random variable chosen from $\lambda$. For a randomized protocol $\Pi$ on $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_k$, the conditional information cost of $\Pi$ with respect to the distribution $\mu$ on $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_k$ and a distribution $\lambda$ partitioning $\mu$ is defined as $I(X; \Pi \mid D)$. The conditional information complexity of a problem $f$ with respect to a distribution $\mu$, a distribution $\lambda$ partitioning $\mu$, and error parameter $\delta$ ($0 \le \delta \le 1$), denoted $\mathrm{IC}^{\delta}_{\mu}(f \mid \lambda)$, is the minimum information cost of a $\delta$-error protocol for $f$ with respect to $\mu$ and $\lambda$. The following proposition can be found in [7].

Proposition 2 For any distribution $\mu$, distribution $\lambda$ partitioning $\mu$, and error parameter $\delta$ ($0 \le \delta \le 1$), $R^{\delta}(f) \ge \mathrm{IC}^{\delta}_{\mu}(f) \ge \mathrm{IC}^{\delta}_{\mu}(f \mid \lambda)$.

Statistical distance measures

Given two probability distributions $\mu$ and $\nu$ over the same space $\mathcal{X}$, the following statistical distance measures will be used in this paper:

1. Total variation distance: $\mathrm{TV}(\mu, \nu) \stackrel{\mathrm{def}}{=} \max_{A \subseteq \mathcal{X}} |\mu(A) - \nu(A)|$.

2. Hellinger distance: $h(\mu, \nu) \stackrel{\mathrm{def}}{=} \sqrt{\frac{1}{2} \sum_{x \in \mathcal{X}} \left(\sqrt{\mu(x)} - \sqrt{\nu(x)}\right)^2}$.

We have the following relation between total variation distance and Hellinger distance (cf. [6], Chapter 2).

Proposition 3 $h^2(\mu, \nu) \le \mathrm{TV}(\mu, \nu) \le h(\mu, \nu)\sqrt{2 - h^2(\mu, \nu)}$.

The total variation distance of transcripts on a pair of inputs is closely related to the error of a randomized protocol. The following proposition can be found in [6], Proposition 6.22 (the original proposition is for the 2-party case, and generalizing it to the multiparty case is straightforward).

Proposition 4 Let $0 < \delta < 1/2$, and $\Pi$ be a $\delta$-error randomized protocol for a function $f : \mathcal{X}_1 \times \cdots \times \mathcal{X}_k \to \mathcal{Z}$. Then, for every two inputs $(x_1, \ldots, x_k), (x'_1, \ldots, x'_k) \in \mathcal{X}_1 \times \cdots \times \mathcal{X}_k$ for which $f(x_1, \ldots, x_k) \ne f(x'_1, \ldots, x'_k)$, it holds that $\mathrm{TV}(\Pi_{x_1, \ldots, x_k}, \Pi_{x'_1, \ldots, x'_k}) > 1 - 2\delta$.

Conventions. In the rest of the paper we call a player a site, to be consistent with the distributed functional monitoring model. We denote $[n] = \{1, \ldots, n\}$. Let $\oplus$ be the XOR function. All logarithms are base-2 unless noted otherwise. We say $\tilde{W}$ is a $(1 + \varepsilon)$-approximation of $W$, $0 < \varepsilon < 1$, if $W \le \tilde{W} \le (1 + \varepsilon) W$.

3 A Lower Bound for $F_0$

We introduce a problem called $k$-APPROX-SUM, and then compose it with 2-DISJ (studied, e.g., in [48]) to prove a lower bound for $F_0$. In this section we work in the message-passing model.

3.1 The $k$-APPROX-SUM Problem

In the $k$-APPROX-SUM$_{f,\tau}$ problem, we have $k$ sites $S_1, S_2, \ldots, S_k$ and the coordinator. Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be an arbitrary function, and let $\tau$ be an arbitrary distribution on $\mathcal{X} \times \mathcal{Y}$ such that for $(X, Y) \sim \tau$, $f(X, Y) = 1$ with probability $\beta$, and 0 with probability $1 - \beta$, where $\beta$ ($c_\beta/k \le \beta \le 1/c_\beta$ for a sufficiently large constant $c_\beta$) is a parameter. We define the input distribution $\mu$ for $k$-APPROX-SUM$_{f,\tau}$ on $\{X_1, \ldots, X_k, Y\} \in \mathcal{X}^k \times \mathcal{Y}$ as follows: We first sample $(X_1, Y) \sim \tau$, and then independently sample $X_2, \ldots, X_k \sim \tau \mid Y$. Note that each pair $(X_i, Y)$ is distributed according to $\tau$. Let $Z_i = f(X_i, Y)$. Thus the $Z_i$'s are i.i.d. Bernoulli($\beta$). Let $Z = \{Z_1, Z_2, \ldots, Z_k\}$. We assign $X_i$ to site $S_i$ for each $i \in [k]$, and assign $Y$ to the coordinator.

In the $k$-APPROX-SUM$_{f,\tau}$ problem, the $k$ sites want to approximate $\sum_{i \in [k]} Z_i$ up to an additive factor of $\sqrt{\beta k}$. In the rest of this section, for convenience, we omit the subscripts $f, \tau$ in $k$-APPROX-SUM$_{f,\tau}$, since our results will hold for all $f, \tau$ having the properties mentioned above.

For a fixed transcript $\Pi = \pi$, let $q^{\pi}_i = \Pr[Z_i = 1 \mid \Pi = \pi]$. Thus $\sum_{i \in [k]} q^{\pi}_i = \mathrm{E}[\sum_{i \in [k]} Z_i \mid \Pi = \pi]$. Let $c_0$ be a sufficiently large constant.
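A quick simulation can give a feel for the accuracy that $k$-APPROX-SUM demands. The sketch below is an illustration under assumptions, not part of the paper: it samples the bits $Z_i$ directly as i.i.d. Bernoulli($\beta$) instead of deriving them from $f$ and $\tau$, it reads the additive error as $\sqrt{\beta k}$, and the function names are hypothetical. It measures how often the zero-communication guess $\beta k$ falls outside the allowed error.

```python
import math
import random

def sample_z(k, beta, rng):
    """Draw Z_1..Z_k i.i.d. Bernoulli(beta), the bits induced by mu."""
    return [1 if rng.random() < beta else 0 for _ in range(k)]

def is_valid_answer(estimate, z, beta):
    """True if estimate is within additive sqrt(beta * k) of sum(z)."""
    return abs(estimate - sum(z)) <= math.sqrt(beta * len(z))

def blind_failure_rate(k, beta, trials, seed=0):
    """Empirical failure rate of always answering beta * k (no communication)."""
    rng = random.Random(seed)
    failures = sum(
        not is_valid_answer(beta * k, sample_z(k, beta, rng), beta)
        for _ in range(trials)
    )
    return failures / trials

print(blind_failure_rate(k=1000, beta=0.1, trials=500))
```

The allowed error $\sqrt{\beta k}$ is on the order of the standard deviation $\sqrt{\beta(1-\beta)k}$ of the sum, so in this simulation the blind guess fails a noticeable constant fraction of the time; a protocol with sufficiently small constant error must therefore learn something about the $Z_i$.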
Definition 1 Given an input $(x_1, \ldots, x_k, y)$ and a transcript $\Pi = \pi$, let $z_i = f(x_i, y)$ and $z = \{z_1, \ldots, z_k\}$. For convenience, we define $\Pi(z) \triangleq \Pi(x_1, \ldots, x_k, y)$. We say

1. $\pi$ is bad$_1$ for $z$ (denoted by $z \perp_1 \pi$) if $\Pi(z) = \pi$, and for at least a 0.1 fraction of $\{i \in [k] \mid z_i = 1\}$, it holds that $q^\pi_i \le \beta/c_0$, and
2. $\pi$ is bad$_0$ for $z$ (denoted by $z \perp_0 \pi$) if $\Pi(z) = \pi$, and for at least a 0.1 fraction of $\{i \in [k] \mid z_i = 0\}$, it holds that $q^\pi_i \ge \beta/c_0$.

And $\pi$ is good for $z$ otherwise.

In this section, we will prove the following theorem. Unless stated otherwise, all probabilities, expectations and variances are taken with respect to the input distribution $\mu$.

Theorem 1 Let $\Pi$ be the transcript of any deterministic protocol for k-approx-sum on input distribution $\mu$ with error probability $\delta$ for some sufficiently small constant $\delta$. Then $\Pr[\Pi \text{ is good}] \ge 0.96$.
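Definition 1 can be read as a simple check on the posterior vector $(q^\pi_1, \ldots, q^\pi_k)$. The following sketch is one way to write that check; the function name, the label strings, and the convention that an empty index set never triggers a "bad" label are assumptions made for illustration. It takes for granted that $\pi$ is the transcript actually produced on $z$.

```python
def classify_transcript(z, q, beta, c0):
    """Return the labels of Definition 1 for indicators z and posteriors q.

    z[i] is the indicator z_i, and q[i] is Pr[Z_i = 1 | transcript].
    """
    threshold = beta / c0
    ones = [i for i, zi in enumerate(z) if zi == 1]
    zeros = [i for i, zi in enumerate(z) if zi == 0]

    low_on_ones = sum(1 for i in ones if q[i] <= threshold)
    high_on_zeros = sum(1 for i in zeros if q[i] >= threshold)

    labels = set()
    # "At least a 0.1 fraction" written with integers to avoid rounding issues.
    if ones and 10 * low_on_ones >= len(ones):
        labels.add("bad_1")
    if zeros and 10 * high_on_zeros >= len(zeros):
        labels.add("bad_0")
    return labels or {"good"}

# Hypothetical example with k = 10, beta = 0.1, c0 = 10 (threshold 0.01).
z = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
informative = [0.9, 0.8] + [0.001] * 8
uninformative = [0.1] * 10
print(classify_transcript(z, informative, 0.1, 10))
print(classify_transcript(z, uninformative, 0.1, 10))
```

A transcript whose posteriors stay at the prior $\beta$ on the zero coordinates is bad$_0$, which matches the intuition that a good transcript must reveal most of the $Z_i$.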

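The following observation, which easily follows from the rectangle property of communication protocols, is crucial to our proof. We have included a proof in Appendix A.

Observation 1 Conditioned on $\Pi$, $Z_1, Z_2, \ldots, Z_k$ are independent.

Definition 2 We say a transcript $\pi$ is rare$^+$ if $\sum_{i \in [k]} q^\pi_i \ge 4\beta k$ and rare$^-$ if $\sum_{i \in [k]} q^\pi_i \le \beta k/4$. In both cases we say $\pi$ is rare. Otherwise we say it is normal.

Definition 3 We say $Z = \{Z_1, Z_2, \ldots, Z_k\}$ is a joker$^+$ if $\sum_{i \in [k]} Z_i \ge 2\beta k$, and a joker$^-$ if $\sum_{i \in [k]} Z_i \le \beta k/2$. In both cases we say $Z$ is a joker.

Lemma 2 Under the assumption of Theorem 1, $\Pr[\Pi \text{ is normal}] \ge 0.99$.

Proof: First, we can apply a Chernoff bound on the random variables $Z_1, \ldots, Z_k$, and get

$$\Pr[Z \text{ is a joker}^+] = \Pr\Big[\sum_{i \in [k]} Z_i \ge 2\beta k\Big] \le e^{-\beta k/3}.$$

Second, by Observation 1, we can apply a Chernoff bound on the random variables $Z_1, \ldots, Z_k$ conditioned on $\Pi$ being rare$^+$:

$$\begin{aligned}
\Pr[Z \text{ is a joker}^+ \mid \Pi \text{ is rare}^+]
&= \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is rare}^+] \cdot \Pr[Z \text{ is a joker}^+ \mid \Pi = \pi, \Pi \text{ is rare}^+] \\
&= \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is rare}^+] \cdot \Pr\Big[\sum_{i \in [k]} Z_i \ge 2\beta k \,\Big|\, \sum_{i \in [k]} q^\pi_i \ge 4\beta k, \Pi = \pi\Big] \\
&\ge \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is rare}^+] \cdot \left(1 - e^{-\beta k/2}\right) \\
&\ge \left(1 - e^{-\beta k/2}\right).
\end{aligned}$$

Finally, by Bayes' theorem, we have that

$$\Pr[\Pi \text{ is rare}^+] = \frac{\Pr[Z \text{ is a joker}^+] \cdot \Pr[\Pi \text{ is rare}^+ \mid Z \text{ is a joker}^+]}{\Pr[Z \text{ is a joker}^+ \mid \Pi \text{ is rare}^+]} \le \frac{e^{-\beta k/3}}{1 - e^{-\beta k/2}} \le 2e^{-\beta k/3}.$$

Similarly, we can also show that $\Pr[\Pi \text{ is rare}^-] \le 2e^{-\beta k/8}$. Therefore $\Pr[\Pi \text{ is rare}] \le 4e^{-\beta k/8} \le 0.01$ (recall that by our assumption $\beta k \ge c_\beta$ for a sufficiently large constant $c_\beta$).

Definition 4 Let $c_\ell = 40c_0$. We say a transcript $\pi$ is weak if $\sum_{i \in [k]} q^\pi_i (1 - q^\pi_i) \ge \beta k/c_\ell$, and strong otherwise.

Lemma 3 Under the assumption of Theorem 1, $\Pr[\Pi \text{ is normal and strong}] \ge 0.98$.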
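The two Chernoff bounds on jokers used in the proof of Lemma 2 can be sanity-checked against exact binomial tails. The sketch below works in log-space to avoid overflow; the helper names and the parameter triples are illustrative assumptions. It checks only the unconditional tails of Binomial($k, \beta$), not the conditional statements in the proof.

```python
import math

def log_binom_pmf(k, p, j):
    """Natural log of Pr[Binomial(k, p) = j]."""
    return (math.lgamma(k + 1) - math.lgamma(j + 1) - math.lgamma(k - j + 1)
            + j * math.log(p) + (k - j) * math.log1p(-p))

def joker_plus_prob(k, beta):
    """Exact Pr[sum Z_i >= 2*beta*k] for i.i.d. Bernoulli(beta) indicators."""
    start = math.ceil(2 * beta * k)
    return sum(math.exp(log_binom_pmf(k, beta, j)) for j in range(start, k + 1))

def joker_minus_prob(k, beta):
    """Exact Pr[sum Z_i <= beta*k/2] for i.i.d. Bernoulli(beta) indicators."""
    end = math.floor(beta * k / 2)
    return sum(math.exp(log_binom_pmf(k, beta, j)) for j in range(0, end + 1))

# Illustrative (k, beta) pairs; each row prints exact tail and Chernoff bound.
for k, beta in [(1000, 0.02), (5000, 0.01), (400, 0.1)]:
    mean = beta * k
    print(k, beta,
          joker_plus_prob(k, beta), math.exp(-mean / 3),
          joker_minus_prob(k, beta), math.exp(-mean / 8))
```

In each row the exact tail is below the corresponding bound, $e^{-\beta k/3}$ for joker$^+$ and $e^{-\beta k/8}$ for joker$^-$.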

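Proof: We first show that for a normal and weak transcript $\pi$, there exists a constant $\delta_\ell = \delta_\ell(c_\ell)$ such that

$$\Pr\Big[\sum_{i \in [k]} Z_i \le \sum_{i \in [k]} q^\pi_i + 2\sqrt{\beta k} \,\Big|\, \Pi = \pi\Big] \ge \delta_\ell, \tag{1}$$

and

$$\Pr\Big[\sum_{i \in [k]} Z_i \ge \sum_{i \in [k]} q^\pi_i + 4\sqrt{\beta k} \,\Big|\, \Pi = \pi\Big] \ge \delta_\ell. \tag{2}$$

The first inequality is a simple application of the Chernoff-Hoeffding bound. Recall that for a normal $\pi$, $\sum_{i \in [k]} q^\pi_i \le 4\beta k$. We have

$$\begin{aligned}
\Pr\Big[\sum_{i \in [k]} Z_i \le \sum_{i \in [k]} q^\pi_i + 2\sqrt{\beta k} \,\Big|\, \Pi = \pi, \Pi \text{ is normal}\Big]
&\ge 1 - \Pr\Big[\sum_{i \in [k]} Z_i \ge \sum_{i \in [k]} q^\pi_i + 2\sqrt{\beta k} \,\Big|\, \Pi = \pi, \Pi \text{ is normal}\Big] \\
&\ge 1 - e^{-\frac{8(\sqrt{\beta k})^2}{\sum_{i \in [k]} q^\pi_i}} \\
&\ge 1 - e^{-2} \\
&\ge \delta_\ell. \quad \text{(for a sufficiently small constant } \delta_\ell\text{)}
\end{aligned}$$

Now we prove the second inequality. We will need the following anti-concentration result, which is an easy consequence of Feller [26] (cf. [46]).

Fact 1 ([46]) Let $Y$ be a sum of independent random variables, each attaining values in $[0, 1]$, and let $\sigma = \sqrt{\mathrm{Var}[Y]} \ge 200$. Then for all $t \in [0, \sigma^2/100]$, we have

$$\Pr[Y \ge \mathbf{E}[Y] + t] \ge c \cdot e^{-t^2/(3\sigma^2)}$$

for a universal constant $c > 0$.

For a normal and weak $\Pi = \pi$, it holds that

$$\begin{aligned}
\mathrm{Var}\Big[\sum_{i \in [k]} Z_i \,\Big|\, \Pi = \pi\Big]
&= \sum_{i \in [k]} \mathrm{Var}[Z_i \mid \Pi = \pi] && \text{(by Observation 1)} \\
&= \sum_{i \in [k]} q^\pi_i (1 - q^\pi_i) \\
&\ge \beta k/c_\ell. && \text{(by the definition of a weak } \pi\text{)}
\end{aligned}$$

Recall that by our assumption, $\beta k \ge c_\beta$ for a sufficiently large constant $c_\beta$; thus $4\sqrt{\beta k} \le \beta k/(100c_\ell)$ and $\sqrt{\beta k/c_\ell} \ge 200$. Using Fact 1, we have for a universal constant $c$,

$$\begin{aligned}
\Pr\Big[\sum_{i \in [k]} Z_i \ge \sum_{i \in [k]} q^\pi_i + 4\sqrt{\beta k} \,\Big|\, \Pi = \pi, \Pi \text{ is weak}\Big]
&\ge c \cdot e^{-\frac{(4\sqrt{\beta k})^2}{3\beta k/c_\ell}} \\
&= c \cdot e^{-16c_\ell/3} \\
&\ge \delta_\ell. \quad \text{(for a sufficiently small constant } \delta_\ell\text{)}
\end{aligned}$$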

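By (1) and (2), it is easy to see that given that $\Pi$ is normal, it cannot be weak with probability more than 0.01, since otherwise by Lemma 2 and the analysis above, the error probability of the protocol would be at least a constant multiple of $\delta_\ell$, which exceeds $\delta$ for an arbitrarily small constant error $\delta$, violating the success guarantee of the lemma. Therefore,

$$\Pr[\Pi \text{ is normal and strong}] = \Pr[\Pi \text{ is normal}] \cdot \Pr[\Pi \text{ is strong} \mid \Pi \text{ is normal}] \ge 0.98.$$

Now we analyze the probability of $\Pi$ being good. For a $Z = z$, let $H_0(z) = \{i \mid z_i = 0\}$ and $H_1(z) = \{i \mid z_i = 1\}$. We have the following two lemmas.

Lemma 4 Under the assumption of Theorem 1, $\Pr[\Pi \text{ is bad}_0 \mid \Pi \text{ is normal and strong}] \le 0.01$.

Proof: Consider any $Z = z$. First, by the definition of a normal $\pi$, we have $\sum_{i : z_i = 0} q^\pi_i \le \sum_{i \in [k]} q^\pi_i \le 4\beta k$. Therefore the number of $i$'s such that $z_i = 0$ and $q^\pi_i > (1 - \beta/c_0)$ is at most $4\beta k/(1 - \beta/c_0) \le 8\beta k$. Second, by the definition of a strong $\pi$, we have $\sum_{i : z_i = 0} q^\pi_i (1 - q^\pi_i) \le \sum_{i \in [k]} q^\pi_i (1 - q^\pi_i) < \beta k/c_\ell$. Therefore the number of $i$'s such that $z_i = 0$ and $\beta/c_0 \le q^\pi_i \le (1 - \beta/c_0)$ is at most $\frac{\beta k/c_\ell}{\beta/c_0 \cdot (1 - \beta/c_0)} \le 0.05k$ (recall $c_\ell = 40c_0$). Also note that if $z$ is not a joker, then $|H_0(z)| \ge (k - 2\beta k)$. Thus, conditioned on a normal and strong $\pi$, and on $z$ not being a joker, the number of $i$'s such that $z_i = 0$ and $q^\pi_i < \beta/c_0$ is at least

$$(k - 2\beta k) - 8\beta k - 0.05k > 0.9k \ge 0.9\,|H_0(z)|,$$

where we have used our assumption that $\beta \le 1/c_\beta$ for a sufficiently large constant $c_\beta$. We conclude that

$$\Pr[\Pi \text{ is bad}_0 \mid \Pi \text{ is normal and strong}] \le \Pr[Z \text{ is a joker}] \le 2e^{-\beta k/8} \le 0.01.$$

Lemma 5 Under the assumption of Theorem 1, $\Pr[\Pi \text{ is bad}_1 \mid \Pi \text{ is normal}] \le 0.01$.

Proof: Call $\pi$ bad$_1$ for a set $T \subseteq [k]$ (denoted by $T \perp_1 \pi$) if for more than a 0.1 fraction of $i \in T$, we have $q^\pi_i \le \beta/c_0$. Let $\chi(E) = 1$ if $E$ holds and $\chi(E) = 0$ otherwise. We have

\begin{align}
&\Pr[\Pi \text{ is bad}_1 \mid \Pi \text{ is normal}] \notag \\
&= \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is normal}] \sum_z \Pr[Z = z \mid \Pi = \pi, \Pi \text{ is normal}] \cdot \chi(z \perp_1 \pi) \notag \\
&\le \Pr[Z \text{ is a joker}] + \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is normal}] \sum_{\ell \in [\beta k/2,\, 2\beta k]} \sum_{T \subseteq [k] : |T| = \ell} \sum_z \Pr[Z = z \mid \Pi = \pi, \Pi \text{ is normal}] \cdot \chi(H_1(z) = T) \cdot \chi(T \perp_1 \pi) \tag{3} \\
&\le \Pr[Z \text{ is a joker}] + \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is normal}] \sum_{\ell \in [\beta k/2,\, 2\beta k]} \sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi}} \prod_{i \in T} q^\pi_i \tag{4}
\end{align}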

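The last inequality holds since in (4), in the last term, we count, for each possible set $T$ of size $\ell$ that is $\perp_1$ to $\pi$, the probability that its elements are all 1, which upper bounds the corresponding summation in (3). Now for a fixed $\ell$, conditioned on a normal $\pi$, we consider the term

$$\sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi}} \prod_{i \in T} q^\pi_i. \tag{5}$$

W.l.o.g., we can assume that $q^\pi_1 \ge \ldots \ge q^\pi_s > \beta/c_0 \ge q^\pi_{s+1} \ge \ldots \ge q^\pi_k$ for an $s = \kappa_s k$ ($0 < \kappa_s \le 1$). We consider a pair $(q^\pi_u, q^\pi_v)$ ($u, v \in [k]$). Terms in the summation (5) that include either $q^\pi_u$ or $q^\pi_v$ can be written as

$$q^\pi_u \sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi \\ u \in T,\, v \notin T}} \prod_{i \in T \setminus u} q^\pi_i \;+\; q^\pi_v \sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi \\ v \in T,\, u \notin T}} \prod_{i \in T \setminus v} q^\pi_i \;+\; q^\pi_u q^\pi_v \sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi \\ v \in T,\, u \in T}} \prod_{i \in T \setminus \{v, u\}} q^\pi_i.$$

By the symmetry of $q^\pi_u, q^\pi_v$, the sets $\{T \setminus u \mid T \subseteq [k], |T| = \ell, T \perp_1 \pi, u \in T, v \notin T\}$ and $\{T \setminus v \mid T \subseteq [k], |T| = \ell, T \perp_1 \pi, v \in T, u \notin T\}$ are the same. Using this fact and the AM-GM inequality, it is easy to see that the sum will not decrease if we set $(q^\pi_u)' = (q^\pi_v)' = (q^\pi_u + q^\pi_v)/2$. Call such an operation an equalization. We repeatedly apply such equalizations to any pair $(q^\pi_u, q^\pi_v)$, with the constraint that if $u \in [1, s]$ and $v \in [s + 1, k]$, then we only average them to the extent that $(q^\pi_u)' = \beta/c_0$, $(q^\pi_v)' = q^\pi_u + q^\pi_v - \beta/c_0$ if $q^\pi_u + q^\pi_v \le 2\beta/c_0$, and $(q^\pi_v)' = \beta/c_0$, $(q^\pi_u)' = q^\pi_u + q^\pi_v - \beta/c_0$ otherwise. We introduce this constraint because we do not want to change $|\{i \mid (q^\pi_i)' \le \beta/c_0\}|$, since otherwise a set $T$ which was originally $\perp_1 \pi$ may no longer be $\perp_1 \pi$ after these equalizations. We cannot further apply equalizations when one of the following happens.

$$(q^\pi_1)' = \ldots = (q^\pi_s)' > \beta/c_0 = (q^\pi_{s+1})' = \ldots = (q^\pi_k)'. \tag{6}$$

$$(q^\pi_1)' = \ldots = (q^\pi_s)' = \beta/c_0 \ge (q^\pi_{s+1})' = \ldots = (q^\pi_k)'. \tag{7}$$

We note that actually (7) cannot happen, since $\sum_{i \in [k]} (q^\pi_i)' = \sum_{i \in [k]} q^\pi_i$ is preserved during equalizations, and conditioned on a normal $\pi$, we have $\sum_{i \in [k]} q^\pi_i \ge \beta k/4 > \beta k/c_0$. Let $q = (q^\pi_1)' = \ldots = (q^\pi_s)'$. For a normal $\pi$, it holds that $\sum_{i \in [k]} (q^\pi_i)' = s \cdot q + (k - s) \cdot \beta/c_0 = r \in [\beta k/4, 4\beta k]$. Let $\alpha \in (0.1, 1]$. Recall that $\ell \in [\beta k/2, 2\beta k]$, and we have set $s = \kappa_s k$. We try to upper bound (5) using (6).

\begin{align}
\sum_{\substack{T \subseteq [k] : |T| = \ell \\ T \perp_1 \pi}} \prod_{i \in T} q^\pi_i
&\le \left( \binom{k - s}{\alpha \ell} \binom{s}{(1 - \alpha)\ell} \right) \cdot \left( \left(\frac{\beta}{c_0}\right)^{\alpha \ell} \left(\frac{r - (k - s)\frac{\beta}{c_0}}{s}\right)^{(1 - \alpha)\ell} \right) \tag{8} \\
&\le \left( \left(\frac{e(1 - \kappa_s)k}{\alpha \ell}\right)^{\alpha \ell} \left(\frac{e\kappa_s k}{(1 - \alpha)\ell}\right)^{(1 - \alpha)\ell} \right) \cdot \left( \left(\frac{\beta}{c_0}\right)^{\alpha \ell} \left(\frac{r}{\kappa_s k}\right)^{(1 - \alpha)\ell} \right) \notag \\
&\le \left(\frac{e\beta k}{\alpha c_0 \ell}\right)^{\alpha \ell} \left(\frac{er}{(1 - \alpha)\ell}\right)^{(1 - \alpha)\ell} \notag \\
&\le \left(\frac{8e}{(c_0)^{\alpha}\, \alpha^{\alpha} (1 - \alpha)^{1 - \alpha}}\right)^{\ell} \notag \\
&\le \left(\frac{8e}{(c_0)^{0.1}\, (1/e)^{2/e}}\right)^{\beta k/2} \tag{9}
\end{align}

In (8), the first term is the number of possible choices of the set $T$ ($|T| = \ell$) with an $\alpha$ fraction of items in $[s + 1, k]$, and the rest in $[1, s]$. And the second term upper bounds $\prod_{i \in T} q^\pi_i$ according to the discussion above. Here we have assumed $\alpha < 1$; otherwise if $\alpha = 1$, then (8) $\le \binom{k}{\ell} (\beta/c_0)^{\ell} \le (2e/c_0)^{\beta k/2}$, which is smaller than (9).

Now, (4) can be upper bounded by

$$\begin{aligned}
&2e^{-\beta k/8} + \sum_\pi \Pr[\Pi = \pi \mid \Pi \text{ is normal}] \cdot 2\beta k \left(\frac{8e}{(c_0)^{0.1}\, (1/e)^{2/e}}\right)^{\beta k/2} \\
&= 2e^{-\beta k/8} + 2\beta k \left(\frac{8e}{(c_0)^{0.1}\, (1/e)^{2/e}}\right)^{\beta k/2} \\
&\le 0.01. \quad \text{(for a sufficiently large constant } c_0\text{)}
\end{aligned}$$

Finally, combining Lemma 3, Lemma 4 and Lemma 5, we get

$$\begin{aligned}
\Pr[\Pi \text{ is good}]
&\ge \Pr[\Pi \text{ is good, normal and strong}] \\
&= \Pr[\Pi \text{ is normal and strong}] \cdot \big(1 - \Pr[\Pi \text{ is bad}_0 \mid \Pi \text{ is normal and strong}] - \Pr[\Pi \text{ is bad}_1 \mid \Pi \text{ is normal and strong}]\big) \\
&\ge \Pr[\Pi \text{ is normal and strong}] \cdot \big(1 - \Pr[\Pi \text{ is bad}_0 \mid \Pi \text{ is normal and strong}]\big) - \Pr[\Pi \text{ is normal}] \cdot \Pr[\Pi \text{ is bad}_1 \mid \Pi \text{ is normal}] \\
&\ge 0.98 \cdot (1 - 0.01) - 0.01 \\
&\ge 0.96.
\end{aligned}$$

3.2 The 2-DISJ Problem

In the 2-DISJ problem, Alice has a set $x \subseteq [n]$ and Bob has a set $y \subseteq [n]$. Their goal is to output 1 if $x \cap y \ne \emptyset$, and 0 otherwise.

We define the input distribution $\tau_\beta$ as follows. Let $\ell = (n + 1)/4$. With probability $\beta$, $x$ and $y$ are random subsets of $[n]$ such that $|x| = |y| = \ell$ and $|x \cap y| = 1$. And with probability $1 - \beta$, $x$ and $y$ are random subsets of $[n]$ such that $|x| = |y| = \ell$ and $x \cap y = \emptyset$. Razborov [48] proved that for $\beta = 1/4$, $D^{1/400}_{\tau_{1/4}}(\text{2-DISJ}) = \Omega(n)$. It is easy to extend this result to general $\beta$ and the average-case complexity.

Theorem 2 ([47], Lemma 2.2) For any $\beta \le 1/4$, it holds that $\mathrm{ED}^{\beta/100}_{\tau_\beta}(\text{2-DISJ}) = \Omega(n)$, where the expectation is taken over the input distribution $\tau_\beta$.

In the rest of the section, we simply write $\tau_\beta$ as $\tau$.

3.3 The Complexity of $F_0$

Connecting $F_0$ and k-approx-sum$_{\text{2-DISJ},\tau}$

Set $\beta = 1/(k\varepsilon^2)$ and $B = 20000/\delta$, where $\delta$ is the small constant error parameter for k-approx-sum in Theorem 1.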
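The input distribution $\tau_\beta$ of Section 3.2 is straightforward to sample. The sketch below is one possible sampler, assuming $n + 1$ is divisible by 4 so that $\ell = (n + 1)/4$ is an integer; the function names and the example parameters are illustrative assumptions.

```python
import random

def sample_tau(n, beta, rng):
    """Sample (x, y) from tau_beta over subsets of {1, ..., n}.

    With probability beta the sets share exactly one element,
    otherwise they are disjoint. Both sets have size ell = (n + 1) / 4.
    """
    ell = (n + 1) // 4
    # 2*ell distinct elements in uniformly random order.
    picks = rng.sample(range(1, n + 1), 2 * ell)
    if rng.random() < beta:
        x = set(picks[:ell])
        # picks[0] is the single shared element; the rest of y avoids x.
        y = set([picks[0]] + picks[ell:2 * ell - 1])
    else:
        x = set(picks[:ell])
        y = set(picks[ell:2 * ell])
    return x, y

def two_disj(x, y):
    """2-DISJ as defined in the text: 1 if the sets intersect, else 0."""
    return 1 if x & y else 0

# Illustrative parameters: n = 39 gives ell = 10.
rng = random.Random(1)
samples = [sample_tau(39, 0.25, rng) for _ in range(4000)]
fraction_intersecting = sum(two_disj(x, y) for x, y in samples) / len(samples)
print(fraction_intersecting)
```

The empirical fraction of intersecting pairs should be close to $\beta$, which is the Bernoulli($\beta$) behaviour that the k-approx-sum construction relies on.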

We choose $f$ to be 2-DISJ with universe size $n = B/\varepsilon^2$, set its input distribution to be $\tau$, and work on k-approx-sum$_{\text{2-DISJ},\tau}$. Let $\mu$ be the input distribution of k-approx-sum$_{\text{2-DISJ},\tau}$, which is a function of $\tau$ (see Section 3.1 for the detailed construction of $\mu$ from $\tau$). Let $\{X_1, \ldots, X_k, Y\} \sim \mu$. Let $Z_i = \text{2-DISJ}(X_i, Y)$. Let $\zeta$ be the induced distribution of $\mu$ on $\{X_1, \ldots, X_k\}$, which we choose to be the input distribution for $F_0$. In the rest of this section, for convenience, we will omit the subscripts 2-DISJ and $\tau$ in k-approx-sum$_{\text{2-DISJ},\tau}$ when there is no confusion.

Let $N = \sum_{i \in [k]} Z_i = \sum_{i \in [k]} \text{2-DISJ}(X_i, Y)$. Let $R = F_0\big(\bigcup_{i \in [k]} X_i \cap Y\big)$. The following lemma shows that $R$ will concentrate around its expectation $\mathbf{E}[R]$, which can be calculated exactly.

Lemma 6 With probability at least $(1 - 6500/B)$, we have $|R - \mathbf{E}[R]| \le 1/(10\varepsilon)$, where $\mathbf{E}[R] = (1 - \lambda)N$ for some fixed constant $0 \le \lambda \le 4/B$.

Proof: We can think of our problem as a balls-and-bins game: think of each pair $(X_i, Y)$ such that $\text{2-DISJ}(X_i, Y) = 1$ as a ball (thus we have $N$ balls), and of the elements in the set $Y$ as bins. Let $\ell = |Y|$. We throw each of the $N$ balls into one of the $\ell$ bins uniformly at random. Our goal is to estimate the number of non-empty bins at the end of the process.

By a Chernoff bound, with probability $\left(1 - e^{-\beta k/3}\right) \ge (1 - 100/B)$, $N \le 2\beta k = 2/\varepsilon^2$. By Fact 1 and Lemma 1 in [41], we have $\mathbf{E}[R] = \ell\left(1 - (1 - 1/\ell)^N\right)$ and $\mathrm{Var}[R] < 4N^2/\ell$. Thus by Chebyshev's inequality we have

$$\Pr[|R - \mathbf{E}[R]| > 1/(10\varepsilon)] \le \frac{\mathrm{Var}[R]}{1/(100\varepsilon^2)} \le \frac{6400}{B}.$$

Let $\theta = N/\ell \le 8/B$. We can write

$$\mathbf{E}[R] = \ell\left(1 - e^{-\theta}\right) + O(1) = \theta\ell\left(1 - \frac{\theta}{2!} + \frac{\theta^2}{3!} - \frac{\theta^3}{4!} + \cdots\right) + O(1).$$

This series converges and thus we can write $\mathbf{E}[R] = (1 - \lambda)\theta\ell = (1 - \lambda)N$ for some fixed constant $0 \le \lambda \le \theta/2 \le 4/B$.

The next lemma shows that we can use a protocol for $F_0$ to solve k-approx-sum with good properties.

Lemma 7 Any protocol $P$ that computes a $(1 + \gamma\varepsilon)$-approximation to $F_0$ (for a sufficiently small constant $\gamma$) on input distribution $\zeta$ with error probability $\delta/2$ can be used to compute k-approx-sum$_{\text{2-DISJ},\tau}$ on input distribution $\mu$ with error probability $\delta$.
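The balls-and-bins calculation in the proof of Lemma 6 can be checked numerically. The sketch below evaluates the closed form $\ell\left(1 - (1 - 1/\ell)^N\right)$, derives the corresponding $\lambda = 1 - \mathbf{E}[R]/N$, and compares the closed form with a small simulation. The function names and the values of $N$ and $\ell$ are illustrative assumptions, and the simulation models only the idealized game of throwing $N$ balls uniformly into $\ell$ bins.

```python
import random

def expected_nonempty_bins(num_balls, num_bins):
    """Closed form ell * (1 - (1 - 1/ell)^N) for N balls and ell bins."""
    return num_bins * (1.0 - (1.0 - 1.0 / num_bins) ** num_balls)

def simulate_nonempty_bins(num_balls, num_bins, trials, seed=0):
    """Average number of non-empty bins over independent trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        occupied = {rng.randrange(num_bins) for _ in range(num_balls)}
        total += len(occupied)
    return total / trials

# Illustrative values with N much smaller than ell, as in the proof.
N, ell = 200, 5000
expected = expected_nonempty_bins(N, ell)
lam = 1.0 - expected / N          # so that E[R] = (1 - lam) * N
simulated = simulate_nonempty_bins(N, ell, trials=400)
print(expected, lam, N / (2 * ell), simulated)
```

Here $\lambda$ lies between 0 and $\theta/2 = N/(2\ell)$, matching the range stated in the proof, and the simulated average agrees with the closed form up to sampling noise.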
Proof: Suppose we are given an input $\{X_1, \ldots, X_k, Y\} \sim \mu$ for k-approx-sum. The $k$ sites and the coordinator use $P$ to compute $\tilde{W}$, which is a $(1 + \gamma\varepsilon)$-approximation to $F_0(X_1, \ldots, X_k)$, and then determine the answer to k-approx-sum to be

$$\frac{\tilde{W} - (n - \ell)}{1 - \lambda}.$$

Recall that $0 \le \lambda \le 4/B$ is some fixed constant, $n = B/\varepsilon^2$ and $\ell = (n + 1)/4$.

Correctness. Given a random input $(X_1, \ldots, X_k, Y) \sim \mu$, the exact value of $W = F_0(X_1, \ldots, X_k)$ can be written as the sum of two components,

$$W = Q + R, \tag{10}$$

where $Q$ counts $F_0\big(\bigcup_{i \in [k]} X_i \setminus Y\big)$, and $R$ counts $F_0\big(\bigcup_{i \in [k]} X_i \cap Y\big)$. First, from our construction it is easy to see by a Chernoff bound and the union bound that with probability $\left(1 - 1/\varepsilon^2 \cdot e^{-\Omega(k)}\right) \ge 1 - 100/B$, we have $Q = |[n] \setminus Y| = n - \ell$, since each element in $[n] \setminus Y$ will be chosen by every $X_i$ ($i = 1, 2, \ldots, k$) with probability at least $1/4$. Second, by Lemma 6 we know that with probability $(1 - 6500/B)$, $R$ is within $1/(10\varepsilon)$ of its mean $(1 - \lambda)N$ for some fixed constant $0 \le \lambda \le 4/B$. Thus with probability $(1 - 6600/B)$, we can write Equation (10) as

$$W = (n - \ell) + (1 - \lambda)N + \kappa_1, \tag{11}$$

for a value $|\kappa_1| \le 1/(10\varepsilon)$ and $N \le 2/\varepsilon^2$.

Set $\gamma = 1/(20B)$. Since $P$ computes a value $\tilde{W}$ which is a $(1 + \gamma\varepsilon)$-approximation of $W = F_0(X_1, X_2, \ldots, X_k)$, we can substitute $W$ with $\tilde{W}$ in Equation (11), resulting in the following:

$$\tilde{W} = (n - \ell) + (1 - \lambda)N + \kappa_1 + \kappa_2, \tag{12}$$

where $|\kappa_1| \le 1/(10\varepsilon)$, $N \le 2/\varepsilon^2$, and

$$\kappa_2 \le \gamma\varepsilon \cdot W = \gamma\varepsilon \cdot \big((n - \ell) + (1 - \lambda)N + \kappa_1\big) \le \gamma\varepsilon \cdot \big(B/\varepsilon^2 + 2/\varepsilon^2 + 1/(10\varepsilon)\big) \le 1/(10\varepsilon).$$

Now we have

$$N = \big(\tilde{W} - (n - \ell) - \kappa_1 - \kappa_2\big)/(1 - \lambda) = \big(\tilde{W} - (n - \ell)\big)/(1 - \lambda) + \kappa_3,$$

where $|\kappa_3| \le \big(1/(10\varepsilon) + 1/(10\varepsilon)\big)/(1 - 4/B) \le 1/(4\varepsilon)$. Therefore $\big(\tilde{W} - (n - \ell)\big)/(1 - \lambda)$ approximates $N = \sum_{i \in [k]} Z_i$ correctly up to an additive error $1/(4\varepsilon) < \sqrt{\beta k} = 1/\varepsilon$, and thus computes k-approx-sum correctly.

The total error probability of this simulation is at most $(\delta/2 + 6600/B)$, where the first term counts the error probability of $P$ and the second term counts the error probability introduced by the reduction. This is less than $\delta$ if we choose $B = 20000/\delta$.

An Embedding Argument

Lemma 8 Suppose that there exists a deterministic protocol $P'$ which computes a $(1 + \gamma\varepsilon)$-approximation of $F_0$ (for a sufficiently small constant $\gamma$) on input distribution $\zeta$ with error probability $\delta/2$ (for a sufficiently small constant $\delta$) and communication $o(C)$. Then there exists a deterministic protocol $P$ that computes 2-DISJ on input distribution $\tau$ with error probability $\beta/100$ and expected communication complexity $o(\log(1/\beta) \cdot C/k)$, where the expectation is taken over the input distribution $\tau$.

Proof: In 2-DISJ, Alice holds $X$ and Bob holds $Y$ such that $(X, Y) \sim \tau$.
We show that Alice and Bob can use the deterministic protocol $P'$ to construct a deterministic protocol $P$ for $\text{2-DISJ}(X, Y)$ with the desired error probability and communication complexity. Alice and Bob first use $P'$ to construct a protocol $P''$. During the construction they will use public and private randomness, which will be fixed at the end. $P''$ consists of two phases.

Input reduction phase. Alice and Bob construct an input for $F_0$ using $X$ and $Y$ as follows: they pick a random site $S_I$ ($I \in [k]$) using public randomness. Alice assigns $S_I$ the input $X_I = X$, and Bob constructs inputs for the remaining $(k - 1)$ sites using $Y$. For each $i \in [k] \setminus I$, Bob samples an $X_i$ according to $\tau \mid Y$ using independent private randomness and assigns it to $S_i$. Let $Z_i = \text{2-DISJ}(X_i, Y)$. Note that $\{X_1, \ldots, X_k, Y\} \sim \mu$ and $\{X_1, \ldots, X_k\} \sim \zeta$.

Simulation phase. Alice simulates $S_I$ and Bob simulates the remaining $(k - 1)$ sites, and they run protocol $P'$ on $\{X_1, \ldots, X_k\} \sim \zeta$ to compute $F_0(X_1, \ldots, X_k)$ up to a $(1 + \gamma\varepsilon)$-approximation for a sufficiently small constant $\gamma$ and error probability $\delta/2$. Let $\pi$ be the protocol transcript, and let $\tilde{W}$ be the output. By Lemma 7, we can use $\tilde{W}$ to compute k-approx-sum with error probability $\delta$. Then by Theorem 1, for a 0.96 fraction of $Z = z$ over the input distribution $\mu$ and $\pi = \Pi(z)$, it holds that for a 0.9 fraction of $\{i \in [k] \mid z_i = 0\}$, $q^\pi_i < \beta/c_0$, and for a 0.9 fraction of $\{i \in [k] \mid z_i = 1\}$, $q^\pi_i > \beta/c_0$. Now $P''$ outputs 1 if $q^\pi_I > \beta/c_0$, and 0 otherwise. Since $S_I$ is chosen randomly among the $k$ sites, and the inputs for the $k$ sites are identically distributed, $P''$ computes $Z_I = \text{2-DISJ}(X, Y)$ on input distribution $\tau$ correctly with probability at least 0.8.

We now describe the final protocol $P$: Alice and Bob repeat $P''$ independently $c_R \log(1/\beta)$ times for a large enough constant $c_R$. At the $j$-th repetition, in the input reduction phase, they choose a random permutation $\sigma_j$ of $[n]$ using public randomness, and apply it to each element in $X_1, \ldots, X_k$ before assigning them to the $k$ sites. After running $P''$ $c_R \log(1/\beta)$ times, $P$ outputs the majority of the outcomes.

Since $Z_I = \text{2-DISJ}(X, Y)$ is fixed at each repetition, the inputs $\{X_1, \ldots, X_k\}$ at each repetition have a small dependence, but conditioned on $Z_I$, they are all independent. Let $\mu'$ be the input distribution of $\{X_1, \ldots, X_k, Y\}$ conditioned on $Z_I = b$. Let $\zeta'$ be the induced distribution of $\mu'$ on $\{X_1, \ldots, X_k\}$.
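The effect of taking the majority over $c_R \log(1/\beta)$ independent runs can be illustrated with an exact binomial computation. The sketch below assumes, purely for illustration, a per-run success probability of 0.7 and a hypothetical constant $c_R = 60$; neither value is claimed to be the one intended by the authors. It computes the probability that at most half of the runs succeed, which upper bounds the failure probability of the majority vote.

```python
import math

def log_binom_pmf(m, p, j):
    """Natural log of Pr[Binomial(m, p) = j]."""
    return (math.lgamma(m + 1) - math.lgamma(j + 1) - math.lgamma(m - j + 1)
            + j * math.log(p) + (m - j) * math.log1p(-p))

def majority_failure_prob(m, p):
    """Pr[at most m/2 of m independent runs succeed], each succeeding w.p. p."""
    return sum(math.exp(log_binom_pmf(m, p, j)) for j in range(0, m // 2 + 1))

def repetitions(beta, c_r):
    """Number of runs c_R * log2(1/beta), rounded up (logs are base 2)."""
    return math.ceil(c_r * math.log2(1.0 / beta))

# Hypothetical constants for illustration only.
C_R = 60
PER_RUN_SUCCESS = 0.7

for beta in (1 / 4, 1 / 64, 1 / 1024):
    m = repetitions(beta, C_R)
    print(beta, m, majority_failure_prob(m, PER_RUN_SUCCESS), beta / 1600)
```

With these assumed constants, the failure probability of the majority falls below $\beta/1600$ for each listed $\beta$, and it decays exponentially in the number of runs.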
The success probability of a run of $P''$ on $\zeta'$ is at least $0.8 - \mathrm{TV}(\zeta, \zeta')$, where $\mathrm{TV}(\zeta, \zeta')$ is the total variation distance between the distributions $\zeta$ and $\zeta'$, which is at most

$$\max\{\mathrm{TV}(\mathrm{Binomial}(k, \beta), \mathrm{Binomial}(k - 1, \beta)),\ \mathrm{TV}(\mathrm{Binomial}(k, \beta), \mathrm{Binomial}(k - 1, \beta) + 1)\},$$

and can be bounded by $O(1/\sqrt{\beta k}) = O(\varepsilon)$ (see, e.g., Fact 2.4 of [30]). Since conditioned on $Z_I$, the inputs at each repetition are independent, and the success probability of each run of $P''$ is at least 0.7, by a Chernoff bound over the $c_R \log(1/\beta)$ repetitions for a sufficiently large $c_R$, we conclude that $P$ succeeds with error probability $\beta/1600$.

We next consider the communication complexity. At each run of $P''$, let $\mathrm{CC}(S_I, S_{-I})$ be the expected communication cost between the site $S_I$ and the remaining players (more precisely, between $S_I$ and the coordinator, since in the coordinator model all sites only talk to the coordinator, whose initial input is $\emptyset$), where the expectation is taken over the input distribution $\zeta$ and the choice of the random $I \in [k]$. Since conditioned on $Y$, all $X_i$ ($i \in [k]$) are independent and identically distributed, if we take a random site $S_I$, the expected communication between $S_I$ and the coordinator should be equal to the total communication divided by a factor of $k$. Thus we have $\mathrm{CC}(S_I, S_{-I}) = o(C/k)$. Finally, by the linearity of expectation, the expected total communication cost of the $O(\log(1/\beta))$ runs of $P''$ is $o(\log(1/\beta) \cdot C/k)$.

At the end we fix all the randomness used in the construction of protocol $P$. We first use two Markov inequalities to fix all public randomness such that $P$ succeeds with error probability $\beta/400$, and the expected total communication cost is $o(\log(1/\beta) \cdot C/k)$, where both the error probability and the cost expectation are taken over the input distribution $\mu$ and Bob's private randomness.
We next use another two Markov inequalities to fix Bob's private randomness such that $P$ succeeds with error probability $\beta/100$, and the expected total communication cost is $o(\log(1/\beta) \cdot C/k)$, where both the error probability and the cost expectation are taken over the input distribution $\mu$.

The following theorem is a direct consequence of Lemma 8, Theorem 2 for 2-DISJ and Lemma 1 (Yao's

Tight Bounds for Distributed Functional Monitoring

Tight Bounds for Distributed Functional Monitoring Tight Bounds for Distributed Functional Monitoring Qin Zhang MADALGO, Aarhus University Joint with David Woodruff, IBM Almaden NII Shonan meeting, Japan Jan. 2012 1-1 The distributed streaming model (a.k.a.

More information

CS229 Lecture notes. Andrew Ng

CS229 Lecture notes. Andrew Ng CS229 Lecture notes Andrew Ng Part IX The EM agorithm In the previous set of notes, we taked about the EM agorithm as appied to fitting a mixture of Gaussians. In this set of notes, we give a broader view

More information

Tight Bounds for Distributed Streaming

Tight Bounds for Distributed Streaming Tight Bounds for Distributed Streaming (a.k.a., Distributed Functional Monitoring) David Woodruff IBM Research Almaden Qin Zhang MADALGO, Aarhus Univ. STOC 12 May 22, 2012 1-1 The distributed streaming

More information

A Brief Introduction to Markov Chains and Hidden Markov Models

A Brief Introduction to Markov Chains and Hidden Markov Models A Brief Introduction to Markov Chains and Hidden Markov Modes Aen B MacKenzie Notes for December 1, 3, &8, 2015 Discrete-Time Markov Chains You may reca that when we first introduced random processes,

More information

Separation of Variables and a Spherical Shell with Surface Charge

Separation of Variables and a Spherical Shell with Surface Charge Separation of Variabes and a Spherica She with Surface Charge In cass we worked out the eectrostatic potentia due to a spherica she of radius R with a surface charge density σθ = σ cos θ. This cacuation

More information

A. Distribution of the test statistic

A. Distribution of the test statistic A. Distribution of the test statistic In the sequentia test, we first compute the test statistic from a mini-batch of size m. If a decision cannot be made with this statistic, we keep increasing the mini-batch

More information

Bayesian Learning. You hear a which which could equally be Thanks or Tanks, which would you go with?

Bayesian Learning. You hear a which which could equally be Thanks or Tanks, which would you go with? Bayesian Learning A powerfu and growing approach in machine earning We use it in our own decision making a the time You hear a which which coud equay be Thanks or Tanks, which woud you go with? Combine

More information

Lower Bound Techniques for Multiparty Communication Complexity

Lower Bound Techniques for Multiparty Communication Complexity Lower Bound Techniques for Multiparty Communication Complexity Qin Zhang Indiana University Bloomington Based on works with Jeff Phillips, Elad Verbin and David Woodruff 1-1 The multiparty number-in-hand

More information

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES Separation of variabes is a method to sove certain PDEs which have a warped product structure. First, on R n, a inear PDE of order m is

More information

Mat 1501 lecture notes, penultimate installment

Mat 1501 lecture notes, penultimate installment Mat 1501 ecture notes, penutimate instament 1. bounded variation: functions of a singe variabe optiona) I beieve that we wi not actuay use the materia in this section the point is mainy to motivate the

More information

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents MARKOV CHAINS AND MARKOV DECISION THEORY ARINDRIMA DATTA Abstract. In this paper, we begin with a forma introduction to probabiity and expain the concept of random variabes and stochastic processes. After

More information

A proposed nonparametric mixture density estimation using B-spline functions

A proposed nonparametric mixture density estimation using B-spline functions A proposed nonparametric mixture density estimation using B-spine functions Atizez Hadrich a,b, Mourad Zribi a, Afif Masmoudi b a Laboratoire d Informatique Signa et Image de a Côte d Opae (LISIC-EA 4491),

More information

XSAT of linear CNF formulas

XSAT of linear CNF formulas XSAT of inear CN formuas Bernd R. Schuh Dr. Bernd Schuh, D-50968 Kön, Germany; bernd.schuh@netcoogne.de eywords: compexity, XSAT, exact inear formua, -reguarity, -uniformity, NPcompeteness Abstract. Open

More information

Bourgain s Theorem. Computational and Metric Geometry. Instructor: Yury Makarychev. d(s 1, s 2 ).

Bourgain s Theorem. Computational and Metric Geometry. Instructor: Yury Makarychev. d(s 1, s 2 ). Bourgain s Theorem Computationa and Metric Geometry Instructor: Yury Makarychev 1 Notation Given a metric space (X, d) and S X, the distance from x X to S equas d(x, S) = inf d(x, s). s S The distance

More information

6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7

6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7 6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17 Soution 7 Probem 1: Generating Random Variabes Each part of this probem requires impementation in MATLAB. For the

More information

Efficiently Generating Random Bits from Finite State Markov Chains

Efficiently Generating Random Bits from Finite State Markov Chains 1 Efficienty Generating Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown

More information

STA 216 Project: Spline Approach to Discrete Survival Analysis

STA 216 Project: Spline Approach to Discrete Survival Analysis : Spine Approach to Discrete Surviva Anaysis November 4, 005 1 Introduction Athough continuous surviva anaysis differs much from the discrete surviva anaysis, there is certain ink between the two modeing

More information

Iterative Decoding Performance Bounds for LDPC Codes on Noisy Channels

Iterative Decoding Performance Bounds for LDPC Codes on Noisy Channels Iterative Decoding Performance Bounds for LDPC Codes on Noisy Channes arxiv:cs/060700v1 [cs.it] 6 Ju 006 Chun-Hao Hsu and Achieas Anastasopouos Eectrica Engineering and Computer Science Department University

More information

Explicit overall risk minimization transductive bound

Explicit overall risk minimization transductive bound 1 Expicit overa risk minimization transductive bound Sergio Decherchi, Paoo Gastado, Sandro Ridea, Rodofo Zunino Dept. of Biophysica and Eectronic Engineering (DIBE), Genoa University Via Opera Pia 11a,

More information

Do Schools Matter for High Math Achievement? Evidence from the American Mathematics Competitions Glenn Ellison and Ashley Swanson Online Appendix

Do Schools Matter for High Math Achievement? Evidence from the American Mathematics Competitions Glenn Ellison and Ashley Swanson Online Appendix VOL. NO. DO SCHOOLS MATTER FOR HIGH MATH ACHIEVEMENT? 43 Do Schoos Matter for High Math Achievement? Evidence from the American Mathematics Competitions Genn Eison and Ashey Swanson Onine Appendix Appendix

More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) (This is a sampe cover image for this issue The actua cover is not yet avaiabe at this time) This artice appeared in a journa pubished by Esevier The attached copy is furnished to the author for interna

More information

Unconditional security of differential phase shift quantum key distribution

Unconditional security of differential phase shift quantum key distribution Unconditiona security of differentia phase shift quantum key distribution Kai Wen, Yoshihisa Yamamoto Ginzton Lab and Dept of Eectrica Engineering Stanford University Basic idea of DPS-QKD Protoco. Aice

More information

Efficient Generation of Random Bits from Finite State Markov Chains

Efficient Generation of Random Bits from Finite State Markov Chains Efficient Generation of Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown

More information

Asynchronous Control for Coupled Markov Decision Systems

Asynchronous Control for Coupled Markov Decision Systems INFORMATION THEORY WORKSHOP (ITW) 22 Asynchronous Contro for Couped Marov Decision Systems Michae J. Neey University of Southern Caifornia Abstract This paper considers optima contro for a coection of

More information

AST 418/518 Instrumentation and Statistics

AST 418/518 Instrumentation and Statistics AST 418/518 Instrumentation and Statistics Cass Website: http://ircamera.as.arizona.edu/astr_518 Cass Texts: Practica Statistics for Astronomers, J.V. Wa, and C.R. Jenkins, Second Edition. Measuring the

More information

arxiv: v1 [cs.db] 25 Jun 2013

arxiv: v1 [cs.db] 25 Jun 2013 Communication Steps for Parae Query Processing Pau Beame, Paraschos Koutris and Dan Suciu {beame,pkoutris,suciu}@cs.washington.edu University of Washington arxiv:1306.5972v1 [cs.db] 25 Jun 2013 June 26,

More information

Rate-Distortion Theory of Finite Point Processes

Rate-Distortion Theory of Finite Point Processes Rate-Distortion Theory of Finite Point Processes Günther Koiander, Dominic Schuhmacher, and Franz Hawatsch, Feow, IEEE Abstract We study the compression of data in the case where the usefu information

More information

8 APPENDIX. E[m M] = (n S )(1 exp( exp(s min + c M))) (19) E[m M] n exp(s min + c M) (20) 8.1 EMPIRICAL EVALUATION OF SAMPLING

8 APPENDIX. E[m M] = (n S )(1 exp( exp(s min + c M))) (19) E[m M] n exp(s min + c M) (20) 8.1 EMPIRICAL EVALUATION OF SAMPLING 8 APPENDIX 8.1 EMPIRICAL EVALUATION OF SAMPLING We wish to evauate the empirica accuracy of our samping technique on concrete exampes. We do this in two ways. First, we can sort the eements by probabiity

More information

#A48 INTEGERS 12 (2012) ON A COMBINATORIAL CONJECTURE OF TU AND DENG

#A48 INTEGERS 12 (2012) ON A COMBINATORIAL CONJECTURE OF TU AND DENG #A48 INTEGERS 12 (2012) ON A COMBINATORIAL CONJECTURE OF TU AND DENG Guixin Deng Schoo of Mathematica Sciences, Guangxi Teachers Education University, Nanning, P.R.China dengguixin@ive.com Pingzhi Yuan

More information

4 Separation of Variables

4 Separation of Variables 4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE

More information

Expectation-Maximization for Estimating Parameters for a Mixture of Poissons

Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Brandon Maone Department of Computer Science University of Hesini February 18, 2014 Abstract This document derives, in excrutiating

More information

Problem set 6 The Perron Frobenius theorem.

Problem set 6 The Perron Frobenius theorem. Probem set 6 The Perron Frobenius theorem. Math 22a4 Oct 2 204, Due Oct.28 In a future probem set I want to discuss some criteria which aow us to concude that that the ground state of a sef-adjoint operator

More information

C. Fourier Sine Series Overview

C. Fourier Sine Series Overview 12 PHILIP D. LOEWEN C. Fourier Sine Series Overview Let some constant > be given. The symboic form of the FSS Eigenvaue probem combines an ordinary differentia equation (ODE) on the interva (, ) with a

More information

FOURIER SERIES ON ANY INTERVAL

FOURIER SERIES ON ANY INTERVAL FOURIER SERIES ON ANY INTERVAL Overview We have spent considerabe time earning how to compute Fourier series for functions that have a period of 2p on the interva (-p,p). We have aso seen how Fourier series

More information
