Tight Bounds for Distributed Functional Monitoring


1 Tight Bounds for Distributed Functional Monitoring Qin Zhang MADALGO, Aarhus University Joint with David Woodruff, IBM Almaden NII Shonan meeting, Japan Jan

2 The distributed streaming model (a.k.a. distributed functional/continuous monitoring) coordinator C sites S 1 S 2 S 3 S k A(t): set of elements received up to time t from all sites. Assume 1 item comes at each time unit. time 2-1

3 The distributed streaming model (a.k.a. distributed functional/continuous monitoring) coordinator C The coordinator needs to maintain f(A(t)) for all t. sites S 1 S 2 S 3 S k A(t): set of elements received up to time t from all sites. Assume 1 item comes at each time unit. time 2-2

4 The distributed streaming model (a.k.a. distributed functional/continuous monitoring) coordinator C The coordinator needs to maintain f(A(t)) for all t. sites S 1 S 2 S 3 S k A(t): set of elements received up to time t from all sites. Assume 1 item comes at each time unit. time Goal: minimize communication cost 2-3

5 Problems coordinator C Static case (a one-shot/static computation at the end) Top-k sites S 1 S 2 S 3 S k Heavy-hitter... Dynamic case Sampling time The Distributed Streaming Model Frequency moments (F 0, F 1, F 2,...) Heavy-hitter Quantile Entropy Non-linear functions

6 This talk What you would like to see: Efficient algorithms/protocols Practical heuristics 4-1

7 This talk What you would like to see: Efficient algorithms/protocols Practical heuristics What you (probably) do not want to see: Useless impossibility results Complicated proofs 4-2

8 This talk What you would like to see: Efficient algorithms/protocols Practical heuristics What you (probably) do not want to see: Useless impossibility results Complicated proofs Unfortunately, in the next 30 minutes

9 The multiparty communication model A model for lower bounds Inputs x 1, x 2, x 3,..., x k We want to compute f(x 1, x 2,..., x k ); f can be bit-wise XOR, OR, AND, MAJ

10 The multiparty communication model A model for lower bounds Inputs x 1, x 2, x 3,..., x k Blackboard: One speaks, everyone else hears. Message passing: If x 1 talks to x 2, others cannot hear. We want to compute f(x 1, x 2,..., x k ); f can be bit-wise XOR, OR, AND, MAJ

11 The multiparty communication model A model for lower bounds Inputs x 1, x 2, x 3,..., x k Blackboard: One speaks, everyone else hears. Message passing: If x 1 talks to x 2, others cannot hear. coordinator C We want to compute f(x 1, x 2,..., x k ); f can be bit-wise XOR, OR, AND, MAJ... S 1 S 2 S 3 S k sites 5-3

12 Previously Some works in the blackboard model. Almost nothing in the message-passing model. 6-1

13 Previously Some works in the blackboard model. Almost nothing in the message-passing model. At this year's SODA, with Jeff Phillips and Elad Verbin, we proposed a general and elegant technique called symmetrization, which works in both variants. In particular, we obtained (in the message-passing model) 1. Ω(nk) for bit-wise XOR/OR/AND/MAJ. 2. Ω(nk) for connectivity. 6-2

14 Previously Some works in the blackboard model. Almost nothing in the message-passing model. At this year's SODA, with Jeff Phillips and Elad Verbin, we proposed a general and elegant technique called symmetrization, which works in both variants. In particular, we obtained (in the message-passing model) 1. Ω(nk) for bit-wise XOR/OR/AND/MAJ. 2. Ω(nk) for connectivity. Artificial? Well...

15 Previously Some works in the blackboard model. Almost nothing in the message-passing model. At this year's SODA, with Jeff Phillips and Elad Verbin, we proposed a general and elegant technique called symmetrization, which works in both variants. In particular, we obtained (in the message-passing model) 1. Ω(nk) for bit-wise XOR/OR/AND/MAJ. 2. Ω(nk) for connectivity. Artificial? Well... In any case, let's look at really important problems. 6-4

16 Now, important problems Sampling Frequency moments (F 0, F 1, F 2,...) Heavy-hitter Quantile Entropy

17 Now, important problems Sampling Solved Frequency moments (F 0, F 1, F 2,...) Heavy-hitter Quantile Entropy

18 Now, important problems Sampling Solved Frequency moments (F 0, F 1, F 2,...) Heavy-hitter Quantile Our work Entropy

19 8-1 Results

20 Results (Almost) tight bounds for all these questions Static lower bounds (almost) match dynamic upper bounds. (up to polylog factors) 8-2

21 Results Today (Almost) tight bounds for all these questions Static lower bounds (almost) match dynamic upper bounds. (up to polylog factors) 8-3

22 F 0 upper bound (Cormode, Muthu and Yi 2008) 9-1

23 The (1 + ε)-approximation F 0 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 0 (∪ i [k] X i ) up to a (1 + ε)-approximation. How many distinct items? A fundamental problem in data analysis. 10-1

24 The (1 + ε)-approximation F 0 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 0 (∪ i [k] X i ) up to a (1 + ε)-approximation. How many distinct items? A fundamental problem in data analysis. Current best UB: Õ(k/ε 2 ) (Cormode, Muthu, Yi 2008) Holds in the dynamic case. 10-2

25 General idea for the one-shot computation Each site generates a sketch via small-space streaming algorithms. The coordinator combines (via communication) the sketches from the k sites to obtain a global sketch, from which we can extract the answer. 11-1

26 The FM sketch Take a pair-wise independent random hash function h : {1,..., n} → {1,..., 2 d }, where 2 d > n 12-1

27 The FM sketch Take a pair-wise independent random hash function h : {1,..., n} → {1,..., 2 d }, where 2 d > n. For each incoming element x, compute h(x) and count how many trailing zeros it has. Remember the max # trailing zeros in any h(x). 12-2

28 The FM sketch Take a pair-wise independent random hash function h : {1,..., n} → {1,..., 2 d }, where 2 d > n. For each incoming element x, compute h(x) and count how many trailing zeros it has. Remember the max # trailing zeros in any h(x). Let Y be the max # trailing zeros. Can show E[2 Y ] = #distinct elements 12-3
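The FM sketch above can be written in a few lines. The following is a minimal illustrative sketch (the specific pairwise-independent hash family, the constant D, and all names are assumptions for illustration, not from the talk):

```python
import random

D = 32  # hash values lie in {0, ..., 2^D - 1}

def make_hash(p=2**61 - 1):
    # pairwise-independent hash: x -> ((a*x + b) mod p) mod 2^D
    a, b = random.randrange(1, p), random.randrange(p)
    return lambda x: ((a * x + b) % p) % (2 ** D)

def trailing_zeros(v):
    # number of trailing zero bits of v (D if v == 0)
    t = 0
    while t < D and v & 1 == 0:
        v >>= 1
        t += 1
    return t

def fm_sketch(stream, h):
    # Y = max # trailing zeros of h(x) over the stream
    return max((trailing_zeros(h(x)) for x in stream), default=0)

h = make_hash()
y = fm_sketch(range(1, 1001), h)  # a stream with 1000 distinct elements
estimate = 2 ** y                 # a single, high-variance estimate of F0
```

A single sketch gives only a constant-factor-quality estimate with constant probability; as the next slides note, averaging/refinement techniques bring the error down to (1 + ε).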

29 One-shot case, the FM sketch (cont.) So 2 Y is an unbiased estimator for # distinct elements 13-1

30 One-shot case, the FM sketch (cont.) So 2 Y is an unbiased estimator for # distinct elements. However, it has a large variance. Some techniques [Bar-Yossef et al. 2002] can produce a good estimator that has probability 1 − δ to be within relative error ε. Space increased to Õ(1/ε 2 ) 13-2

31 One-shot case, the FM sketch (cont.) So 2 Y is an unbiased estimator for # distinct elements. However, it has a large variance. Some techniques [Bar-Yossef et al. 2002] can produce a good estimator that has probability 1 − δ to be within relative error ε. Space increased to Õ(1/ε 2 ) FM sketch has linearity: Y 1 from A, Y 2 from B, then 2 max{Y 1,Y 2 } estimates # distinct items in A ∪ B. 13-3

32 One-shot case, the FM sketch (cont.) So 2 Y is an unbiased estimator for # distinct elements. However, it has a large variance. Some techniques [Bar-Yossef et al. 2002] can produce a good estimator that has probability 1 − δ to be within relative error ε. Space increased to Õ(1/ε 2 ) FM sketch has linearity: Y 1 from A, Y 2 from B, then 2 max{Y 1,Y 2 } estimates # distinct items in A ∪ B. Thus, we can use it to design a one-shot algorithm with communication Õ(k/ε 2 ) 13-4
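The mergeability property is what makes the one-shot protocol work: if all sites use the same hash function, the coordinator's max over the k local values equals the sketch of the union. A small self-contained illustration (hash family, constant D, and names are illustrative assumptions):

```python
import random

D = 32  # hash values lie in {0, ..., 2^D - 1}

def make_hash(p=2**61 - 1):
    # pairwise-independent hash: x -> ((a*x + b) mod p) mod 2^D
    a, b = random.randrange(1, p), random.randrange(p)
    return lambda x: ((a * x + b) % p) % (2 ** D)

def tz(v):
    # trailing zeros of v (D if v == 0)
    t = 0
    while t < D and v & 1 == 0:
        v >>= 1
        t += 1
    return t

def local_sketch(items, h):
    # a site's FM sketch value: max # trailing zeros of h(x)
    return max((tz(h(x)) for x in items), default=0)

# k sites share one hash function; the coordinator takes the max of the
# k transmitted values, which equals the sketch computed on the union.
h = make_hash()
sites = [range(i * 500, i * 500 + 800) for i in range(4)]  # overlapping sets
merged = max(local_sketch(s, h) for s in sites)
union = set().union(*sites)
assert merged == local_sketch(union, h)
estimate = 2 ** merged  # estimates F0 of the union
```

Each site ships only its integer Y_i (plus the shared hash seed), which is why repeating this O(1/ε 2 ) times gives a protocol with Õ(k/ε 2 ) total communication.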

33 14-1 F 0 lower bound

34 The F 0 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 0 (∪ i [k] X i ) up to a (1 + ε)-approximation. How many distinct items? A fundamental problem in data analysis. 15-1

35 The F 0 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 0 (∪ i [k] X i ) up to a (1 + ε)-approximation. How many distinct items? A fundamental problem in data analysis. Current best UB: Õ(k/ε 2 ) (Cormode, Muthu, Yi, 2008) Holds in the dynamic case. Previous LB: Ω(k) (Cormode, Muthu, Yi, 2008); Ω(1/ε 2 ) (reduction from Gap-Hamming) Our LB: Ω(k/ε 2 ). Holds in the static and message-passing case.

36 The F 0 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 0 (∪ i [k] X i ) up to a (1 + ε)-approximation. How many distinct items? A fundamental problem in data analysis. Tight! Current best UB: Õ(k/ε 2 ) (Cormode, Muthu, Yi, 2008) Holds in the dynamic case. Previous LB: Ω(k) (Cormode, Muthu, Yi, 2008); Ω(1/ε 2 ) (reduction from Gap-Hamming) Our LB: Ω(k/ε 2 ). Holds in the static and message-passing case.

37 The proof framework Step 1: We first introduce a simpler problem called k-gap-maj. Step 2: We compose k-gap-maj with the Set Disjointness problem, using information cost, to prove a lower bound for F 0.

38 k-gap-maj We have k sites S 1, S 2,..., S k. S i holds a bit Z i which is 1 w.p. β and 0 w.p. 1 − β, where ω(1/k) ≤ β ≤ 1/2 is a prefixed value. Our goal: compute the following function. GM(Z 1, Z 2,..., Z k ) = 0, if Σ i [k] Z i ≤ βk − √(βk); 1, if Σ i [k] Z i ≥ βk + √(βk); ∗, otherwise, where ∗ means that the answer can be arbitrary. 17-1

39 k-gap-maj We have k sites S 1, S 2,..., S k. S i holds a bit Z i which is 1 w.p. β and 0 w.p. 1 − β, where ω(1/k) ≤ β ≤ 1/2 is a prefixed value. Our goal: compute the following function. GM(Z 1, Z 2,..., Z k ) = 0, if Σ i [k] Z i ≤ βk − √(βk); 1, if Σ i [k] Z i ≥ βk + √(βk); ∗, otherwise, where ∗ means that the answer can be arbitrary. Lemma 1: If a protocol P computes k-gap-maj correctly w.p , then w.p. Ω(1), the protocol has to learn at least Ω(k) of the Z i, each with Ω(1) bits (that is, H(Z i | Π) ≤ H b (0.01β)). 17-2

40 k-gap-maj We have k sites S 1, S 2,..., S k. S i holds a bit Z i which is 1 w.p. β and 0 w.p. 1 − β, where ω(1/k) ≤ β ≤ 1/2 is a prefixed value. Our goal: compute the following function. GM(Z 1, Z 2,..., Z k ) = 0, if Σ i [k] Z i ≤ βk − √(βk); 1, if Σ i [k] Z i ≥ βk + √(βk); ∗, otherwise, where ∗ means that the answer can be arbitrary. Lemma 1: If a protocol P computes k-gap-maj correctly w.p , then w.p. Ω(1), the protocol has to learn at least Ω(k) of the Z i, each with Ω(1) bits (that is, H(Z i | Π) ≤ H b (0.01β)). Alternatively: I(Z 1, Z 2,..., Z k ; Π) = Ω(k) 17-3
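The three-case definition of GM can be stated as a small function. A sketch of the definition above (function and variable names are illustrative; None stands for the ∗ "any answer" case):

```python
from math import sqrt

def gap_maj(z, beta):
    # k-GAP-MAJ: 0 below the gap, 1 above it, None (= arbitrary) inside
    k, s = len(z), sum(z)
    if s <= beta * k - sqrt(beta * k):
        return 0
    if s >= beta * k + sqrt(beta * k):
        return 1
    return None  # the * case: any answer is acceptable

# beta*k = 25, sqrt(beta*k) = 5, so the thresholds are 20 and 30
beta = 0.25
print(gap_maj([1] * 10 + [0] * 90, beta))  # sum 10 <= 20  -> 0
print(gap_maj([1] * 40 + [0] * 60, beta))  # sum 40 >= 30  -> 1
print(gap_maj([1] * 25 + [0] * 75, beta))  # in the gap    -> None
```

The √(βk)-wide gap is exactly one standard deviation of Σ Z i, which is what makes the problem hard: a protocol cannot answer from the prior alone.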

41 Set disjointness (2-DISJ) Alice x ∈ {0, 1} n x ∩ y = ∅? Bob y ∈ {0, 1} n 18-1

42 Set disjointness (2-DISJ) Alice x ∈ {0, 1} n x ∩ y = ∅? Bob y ∈ {0, 1} n A classical hard instance: Distribution µ: X and Y are both random subsets of size l = (n+1)/4 from [n] such that |X ∩ Y | = 1 w.p. β and |X ∩ Y | = 0 w.p. 1 − β. Razborov [1990] shows an Ω(n) lower bound for this hard distribution and error β/
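To make the hard distribution concrete, here is a simplified sampler in the spirit of µ: both sets have size l = (n+1)/4, and they intersect in exactly one element with probability β. (This is an illustrative sketch, not Razborov's exact construction; all names are assumptions.)

```python
import random

def sample_mu(n, beta, rng):
    # Sample (X, Y): size-l subsets of [n], l = (n+1)//4, with
    # |X ∩ Y| = 1 w.p. beta and |X ∩ Y| = 0 otherwise.
    l = (n + 1) // 4
    universe = list(range(1, n + 1))
    if rng.random() < beta:
        # intersecting case: one shared element, the rest disjoint
        picks = rng.sample(universe, 2 * l - 1)
        shared = picks[0]
        x = set(picks[1:l]) | {shared}
        y = set(picks[l:]) | {shared}
    else:
        # disjoint case
        picks = rng.sample(universe, 2 * l)
        x, y = set(picks[:l]), set(picks[l:])
    return x, y

rng = random.Random(0)
x, y = sample_mu(19, 0.5, rng)  # n = 19, so l = 5
```

Note that deciding |X ∩ Y| here is exactly recovering a Bernoulli(β) bit, which is the bit Z i fed into k-gap-maj in the composition on the next slides.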

43 Next step: Compose k-gap-maj with 2-DISJ n = Θ(1/ε 2 ) l = (n + 1)/4 β = 1/kε 2 coordinator C sites S 1 S 2 S 3 S k 19-1

44 Next step: Compose k-gap-maj with 2-DISJ n = Θ(1/ε 2 ) l = (n + 1)/4 β = 1/kε 2 coordinator C Y Step 1: Pick Y = y ⊆ [n] of size l uniformly at random sites S 1 S 2 S 3 S k 19-2

45 Next step: Compose k-gap-maj with 2-DISJ n = Θ(1/ε 2 ) l = (n + 1)/4 β = 1/kε 2 coordinator C Y Step 1: Pick Y = y ⊆ [n] of size l uniformly at random sites S 1 S 2 S 3 S k X 1 X 2 X 3 X k Step 2: Pick X 1,..., X k ⊆ [n] independently and randomly from µ Y =y 19-3

46 Next step: Compose k-gap-maj with 2-DISJ n = Θ(1/ε 2 ) l = (n + 1)/4 β = 1/kε 2 coordinator C Y Step 1: Pick Y = y ⊆ [n] of size l uniformly at random F 0 (X 1, X 2,..., X k )? sites S 1 S 2 S 3 S k X 1 X 2 X 3 X k Step 2: Pick X 1,..., X k ⊆ [n] independently and randomly from µ Y =y 19-4

47 The proof coordinator C Y Z i = |X i ∩ Y |, which is 1 w.p. β and 0 w.p. 1 − β sites S 1 S 2 S 3 S k X 1 X 2 X 3 X k F 0 (X 1, X 2,..., X k ) ⇒ k-gap-maj(Z 1, Z 2,..., Z k ) (Z i = |X i ∩ Y |) ⇒ learn Ω(k) of the Z i well (by Lemma 1) ⇒ need Ω(k/ε 2 ) bits (learning each Z i = |X i ∩ Y | well needs Ω(n) = Ω(1/ε 2 ) bits, by 2-DISJ) 20-1

48 The proof coordinator C Y Z i = |X i ∩ Y |, which is 1 w.p. β and 0 w.p. 1 − β sites S 1 S 2 S 3 S k X 1 X 2 X 3 X k F 0 (X 1, X 2,..., X k ) ⇒ k-gap-maj(Z 1, Z 2,..., Z k ) Q.E.D. (Z i = |X i ∩ Y |) ⇒ learn Ω(k) of the Z i well (by Lemma 1) ⇒ need Ω(k/ε 2 ) bits (learning each Z i = |X i ∩ Y | well needs Ω(n) = Ω(1/ε 2 ) bits, by 2-DISJ) 20-2

49 Proof sketch of Lemma 1 Lemma 1: If a protocol P computes k-gap-maj correctly w.p , then w.p. Ω(1), for Ω(k) of the Z i, we have H(Z i | Π) ≤ H b (0.01β). Proof: 1. Suppose Π does not satisfy this. 2. Since the Z i are independent given Π, Σ k i=1 Z i given Π is a sum of independent Bernoulli random variables. 3. Since most H(Z i | Π) are large, by anti-concentration, both of the following events occur with constant probability: Σ k i=1 Z i > βk + √(βk) given Π, and Σ k i=1 Z i < βk − √(βk) given Π. 4. So P can't succeed with large probability. 21-1
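Step 3 rests on the fact that the √(βk) gap is about one standard deviation of the sum, so both tails carry constant mass. A small empirical check of that fact (the parameters and names are illustrative assumptions):

```python
from math import sqrt
import random

# The sum of k independent Bernoulli(beta) bits lands above
# beta*k + sqrt(beta*k) and below beta*k - sqrt(beta*k), each with
# constant probability: the gap is ~1 standard deviation, since
# Var = k*beta*(1-beta) ~ beta*k for small beta.
rng = random.Random(1)
k, beta, trials = 2500, 0.04, 1000
center, dev = beta * k, sqrt(beta * k)  # 100 and 10 here
hi = lo = 0
for _ in range(trials):
    s = sum(1 for _ in range(k) if rng.random() < beta)
    hi += s > center + dev
    lo += s < center - dev
# both tails occur in a constant fraction of the trials
upper_frac, lower_frac = hi / trials, lo / trials
```

Since a protocol that leaves most Z i uncertain cannot distinguish the two tails, it fails on k-gap-maj with constant probability, which is the contradiction in step 4.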

50 22-1 F 2 lower bound

51 The F 2 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 2 (∪ i [k] X i ) up to a (1 + ε)-approximation. Join What's the size of the self-join? Another fundamental problem in data analysis. 23-1

52 The F 2 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 2 (∪ i [k] X i ) up to a (1 + ε)-approximation. Join What's the size of the self-join? Another fundamental problem in data analysis. Previous UB: Õ(k 2 /ε + k 1.5 /ε 3 ) (Cormode, Muthu, Yi 2008) Our UB: Õ(k/poly(ε)), a one-way protocol. Holds in the dynamic case. Previous LB: Ω(k) (Cormode, Muthu, Yi, 2008) Our LB: Ω(k/ε 2 ). Holds in the static and blackboard case. 23-2

53 The F 2 problem We have k sites S 1, S 2,..., S k. S i holds a set X i. Our goal: compute F 2 (∪ i [k] X i ) up to a (1 + ε)-approximation. Join Almost Tight! What's the size of the self-join? Another fundamental problem in data analysis. Previous UB: Õ(k 2 /ε + k 1.5 /ε 3 ) (Cormode, Muthu, Yi 2008) Our UB: Õ(k/poly(ε)), a one-way protocol. Holds in the dynamic case. Previous LB: Ω(k) (Cormode, Muthu, Yi, 2008) Our LB: Ω(k/ε 2 ). Holds in the static and blackboard case. 23-3

54 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. 24-1

55 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. k-DISJ: We have k sites S 1, S 2,..., S k. S i holds a set Z i. We promise that either the Z i (i = 1,..., k) are all disjoint, or they all intersect in one common element and are otherwise disjoint (a sunflower). The goal is to find out which is the case. 24-2

56 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. k-XOR 2 copies k-DISJ: We have k sites S 1, S 2,..., S k. S i holds a set Z i. We promise that either the Z i (i = 1,..., k) are all disjoint, or they all intersect in one common element and are otherwise disjoint (a sunflower). The goal is to find out which is the case. 24-3

57 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. k-XOR 2 copies compose via information cost k-BTA k-DISJ: We have k sites S 1, S 2,..., S k. S i holds a set Z i. We promise that either the Z i (i = 1,..., k) are all disjoint, or they all intersect in one common element and are otherwise disjoint (a sunflower). The goal is to find out which is the case. 24-4

58 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. k-XOR 2 copies compose via information cost k-BTA CC(k-BTA) = Ω(k/ε 2 ) k-DISJ: We have k sites S 1, S 2,..., S k. S i holds a set Z i. We promise that either the Z i (i = 1,..., k) are all disjoint, or they all intersect in one common element and are otherwise disjoint (a sunflower). The goal is to find out which is the case. 24-5

59 A quick glance: (1 + ε)-approximation F 2 2-party gap-hamming: Alice has X = {X 1, X 2,..., X 1/ε 2}, Bob has Y = {Y 1, Y 2,..., Y 1/ε 2}. They want to compute: GHD(X, Y ) = 0, if Σ i [1/ε 2 ] X i ⊕ Y i ≤ 1/2ε 2 − 1/ε; 1, if Σ i [1/ε 2 ] X i ⊕ Y i ≥ 1/2ε 2 + 1/ε; ∗, otherwise, where ∗ means that the answer can be arbitrary. k-XOR 2 copies compose via information cost k-BTA CC(k-BTA) = Ω(k/ε 2 ) k-DISJ: We have k sites S 1, S 2,..., S k. S i holds a set Z i. We promise that either the Z i (i = 1,..., k) are all disjoint, or they all intersect in one common element and are otherwise disjoint (a sunflower). The goal is to find out which is the case. Finally, we reduce F 2 to k-BTA. 24-6
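Like GM for F 0, the gap-hamming promise problem GHD has a three-case definition that can be stated directly as code. An illustrative sketch of the definition (names are assumptions; None stands for the ∗ "any answer" case):

```python
def ghd(x, y, eps):
    # 2-party gap-hamming on 1/eps^2 coordinates, as defined above
    m = round(1 / eps ** 2)
    dist = sum(xi != yi for xi, yi in zip(x, y))  # Hamming distance
    if dist <= m / 2 - 1 / eps:
        return 0
    if dist >= m / 2 + 1 / eps:
        return 1
    return None  # the * case: any answer is acceptable

# eps = 0.1: m = 100, so the thresholds are 40 and 60
eps = 0.1
x = [0] * 100
print(ghd(x, [1] * 30 + [0] * 70, eps))  # distance 30 <= 40 -> 0
print(ghd(x, [1] * 70 + [0] * 30, eps))  # distance 70 >= 60 -> 1
print(ghd(x, [1] * 50 + [0] * 50, eps))  # distance 50, in the gap -> None
```

The 1/ε gap around m/2 is again one standard deviation of the Hamming distance between random strings, mirroring the √(βk) gap in k-gap-maj.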

60 The end THANK YOU Q and A 25-1

Tight Bounds for Distributed Streaming

Tight Bounds for Distributed Streaming Tight Bounds for Distributed Streaming (a.k.a., Distributed Functional Monitoring) David Woodruff IBM Research Almaden Qin Zhang MADALGO, Aarhus Univ. STOC 12 May 22, 2012 1-1 The distributed streaming

More information

Lower Bound Techniques for Multiparty Communication Complexity

Lower Bound Techniques for Multiparty Communication Complexity Lower Bound Techniques for Multiparty Communication Complexity Qin Zhang Indiana University Bloomington Based on works with Jeff Phillips, Elad Verbin and David Woodruff 1-1 The multiparty number-in-hand

More information

Lower Bounds for Number-in-Hand Multiparty Communication Complexity, Made Easy

Lower Bounds for Number-in-Hand Multiparty Communication Complexity, Made Easy Lower Bounds for Number-in-Hand Multiparty Communication Complexity, Made Easy Jeff M. Phillips School of Computing University of Utah jeffp@cs.utah.edu Elad Verbin Dept. of Computer Science Aarhus University,

More information

c 2016 Jeff M. Phillips, Elad Verbin, and Qin Zhang

c 2016 Jeff M. Phillips, Elad Verbin, and Qin Zhang SIAM J. COMPUT. Vol. 45, No. 1, pp. 174 196 c 2016 Jeff M. Phillips, Elad Verbin, and Qin Zhang LOWER BOUNDS FOR NUMBER-IN-HAND MULTIPARTY COMMUNICATION COMPLEXITY, MADE EASY JEFF M. PHILLIPS, ELAD VERBIN,

More information

Distributed Statistical Estimation of Matrix Products with Applications

Distributed Statistical Estimation of Matrix Products with Applications Distributed Statistical Estimation of Matrix Products with Applications David Woodruff CMU Qin Zhang IUB PODS 2018 June, 2018 1-1 The Distributed Computation Model p-norms, heavy-hitters,... A {0, 1} m

More information

CS Foundations of Communication Complexity

CS Foundations of Communication Complexity CS 2429 - Foundations of Communication Complexity Lecturer: Sergey Gorbunov 1 Introduction In this lecture we will see how to use methods of (conditional) information complexity to prove lower bounds for

More information

Computing the Entropy of a Stream

Computing the Entropy of a Stream Computing the Entropy of a Stream To appear in SODA 2007 Graham Cormode graham@research.att.com Amit Chakrabarti Dartmouth College Andrew McGregor U. Penn / UCSD Outline Introduction Entropy Upper Bound

More information

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems Arnab Bhattacharyya, Palash Dey, and David P. Woodruff Indian Institute of Science, Bangalore {arnabb,palash}@csa.iisc.ernet.in

More information

Streaming and communication complexity of Hamming distance

Streaming and communication complexity of Hamming distance Streaming and communication complexity of Hamming distance Tatiana Starikovskaya IRIF, Université Paris-Diderot (Joint work with Raphaël Clifford, ICALP 16) Approximate pattern matching Problem Pattern

More information

Linear Sketches A Useful Tool in Streaming and Compressive Sensing

Linear Sketches A Useful Tool in Streaming and Compressive Sensing Linear Sketches A Useful Tool in Streaming and Compressive Sensing Qin Zhang 1-1 Linear sketch Random linear projection M : R n R k that preserves properties of any v R n with high prob. where k n. M =

More information

Lecture Lecture 9 October 1, 2015

Lecture Lecture 9 October 1, 2015 CS 229r: Algorithms for Big Data Fall 2015 Lecture Lecture 9 October 1, 2015 Prof. Jelani Nelson Scribe: Rachit Singh 1 Overview In the last lecture we covered the distance to monotonicity (DTM) and longest

More information

Information Complexity and Applications. Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017

Information Complexity and Applications. Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017 Information Complexity and Applications Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017 Coding vs complexity: a tale of two theories Coding Goal: data transmission Different channels

More information

Some notes on streaming algorithms continued

Some notes on streaming algorithms continued U.C. Berkeley CS170: Algorithms Handout LN-11-9 Christos Papadimitriou & Luca Trevisan November 9, 016 Some notes on streaming algorithms continued Today we complete our quick review of streaming algorithms.

More information

Sketching and Streaming for Distributions

Sketching and Streaming for Distributions Sketching and Streaming for Distributions Piotr Indyk Andrew McGregor Massachusetts Institute of Technology University of California, San Diego Main Material: Stable distributions, pseudo-random generators,

More information

Lecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1

Lecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1 Lecture 10 Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1 Recap Sublinear time algorithms Deterministic + exact: binary search Deterministic + inexact: estimating diameter in a

More information

Communication-Efficient Computation on Distributed Noisy Datasets. Qin Zhang Indiana University Bloomington

Communication-Efficient Computation on Distributed Noisy Datasets. Qin Zhang Indiana University Bloomington Communication-Efficient Computation on Distributed Noisy Datasets Qin Zhang Indiana University Bloomington SPAA 15 June 15, 2015 1-1 Model of computation The coordinator model: k sites and 1 coordinator.

More information

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 Today we ll talk about a topic that is both very old (as far as computer science

More information

Communication Complexity 16:198:671 2/15/2010. Lecture 6. P (x) log

Communication Complexity 16:198:671 2/15/2010. Lecture 6. P (x) log Communication Complexity 6:98:67 2/5/200 Lecture 6 Lecturer: Nikos Leonardos Scribe: Troy Lee Information theory lower bounds. Entropy basics Let Ω be a finite set and P a probability distribution on Ω.

More information

A Near-Optimal Algorithm for Computing the Entropy of a Stream

A Near-Optimal Algorithm for Computing the Entropy of a Stream A Near-Optimal Algorithm for Computing the Entropy of a Stream Amit Chakrabarti ac@cs.dartmouth.edu Graham Cormode graham@research.att.com Andrew McGregor andrewm@seas.upenn.edu Abstract We describe a

More information

Lecture 16: Communication Complexity

Lecture 16: Communication Complexity CSE 531: Computational Complexity I Winter 2016 Lecture 16: Communication Complexity Mar 2, 2016 Lecturer: Paul Beame Scribe: Paul Beame 1 Communication Complexity In (two-party) communication complexity

More information

Algorithms for Querying Noisy Distributed/Streaming Datasets

Algorithms for Querying Noisy Distributed/Streaming Datasets Algorithms for Querying Noisy Distributed/Streaming Datasets Qin Zhang Indiana University Bloomington Sublinear Algo Workshop @ JHU Jan 9, 2016 1-1 The big data models The streaming model (Alon, Matias

More information

B669 Sublinear Algorithms for Big Data

B669 Sublinear Algorithms for Big Data B669 Sublinear Algorithms for Big Data Qin Zhang 1-1 2-1 Part 1: Sublinear in Space The model and challenge The data stream model (Alon, Matias and Szegedy 1996) a n a 2 a 1 RAM CPU Why hard? Cannot store

More information

Near-Optimal Lower Bounds on the Multi-Party Communication Complexity of Set Disjointness

Near-Optimal Lower Bounds on the Multi-Party Communication Complexity of Set Disjointness Near-Optimal Lower Bounds on the Multi-Party Communication Complexity of Set Disjointness Amit Chakrabarti School of Mathematics Institute for Advanced Study Princeton, NJ 08540 amitc@ias.edu Subhash Khot

More information

CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms

CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms Instructor: Mohammad T. Hajiaghayi Scribe: Huijing Gong November 11, 2014 1 Overview In the

More information

Complexity of Finding a Duplicate in a Stream: Simple Open Problems

Complexity of Finding a Duplicate in a Stream: Simple Open Problems Complexity of Finding a Duplicate in a Stream: Simple Open Problems Jun Tarui Univ of Electro-Comm, Tokyo Shonan, Jan 2012 1 Duplicate Finding Problem Given: Stream a 1,,a m a i {1,,n} Assuming m > n,

More information

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems Arnab Bhattacharyya Indian Institute of Science, Bangalore arnabb@csa.iisc.ernet.in Palash Dey Indian Institute of

More information

On the tightness of the Buhrman-Cleve-Wigderson simulation

On the tightness of the Buhrman-Cleve-Wigderson simulation On the tightness of the Buhrman-Cleve-Wigderson simulation Shengyu Zhang Department of Computer Science and Engineering, The Chinese University of Hong Kong. syzhang@cse.cuhk.edu.hk Abstract. Buhrman,

More information

An exponential separation between quantum and classical one-way communication complexity

An exponential separation between quantum and classical one-way communication complexity An exponential separation between quantum and classical one-way communication complexity Ashley Montanaro Centre for Quantum Information and Foundations, Department of Applied Mathematics and Theoretical

More information

Stream Computation and Arthur- Merlin Communication

Stream Computation and Arthur- Merlin Communication Stream Computation and Arthur- Merlin Communication Justin Thaler, Yahoo! Labs Joint Work with: Amit Chakrabarti, Dartmouth Graham Cormode, University of Warwick Andrew McGregor, Umass Amherst Suresh Venkatasubramanian,

More information

Algorithms for Distributed Functional Monitoring

Algorithms for Distributed Functional Monitoring Algorithms for Distributed Functional Monitoring Graham Cormode AT&T Labs and S. Muthukrishnan Google Inc. and Ke Yi Hong Kong University of Science and Technology We study what we call functional monitoring

More information

CS Introduction to Complexity Theory. Lecture #11: Dec 8th, 2015

CS Introduction to Complexity Theory. Lecture #11: Dec 8th, 2015 CS 2401 - Introduction to Complexity Theory Lecture #11: Dec 8th, 2015 Lecturer: Toniann Pitassi Scribe Notes by: Xu Zhao 1 Communication Complexity Applications Communication Complexity (CC) has many

More information

Topology Matters in Communication

Topology Matters in Communication Electronic Colloquium on Computational Complexity, Report No. 74 204 Topology Matters in Communication Arkadev Chattopadhyay Jaikumar Radhakrishnan Atri Rudra May 4, 204 Abstract We provide the first communication

More information

Space- Efficient Es.ma.on of Sta.s.cs over Sub- Sampled Streams

Space- Efficient Es.ma.on of Sta.s.cs over Sub- Sampled Streams Space- Efficient Es.ma.on of Sta.s.cs over Sub- Sampled Streams Andrew McGregor University of Massachuse7s mcgregor@cs.umass.edu A. Pavan Iowa State University pavan@cs.iastate.edu Srikanta Tirthapura

More information

A Tight Lower Bound for Dynamic Membership in the External Memory Model

A Tight Lower Bound for Dynamic Membership in the External Memory Model A Tight Lower Bound for Dynamic Membership in the External Memory Model Elad Verbin ITCS, Tsinghua University Qin Zhang Hong Kong University of Science & Technology April 2010 1-1 The computational model

More information

Nondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity

Nondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Matt Brown Date: January 25, 2012 LECTURE 5 Nondeterminism In this lecture, we introduce nondeterministic

More information

11 Heavy Hitters Streaming Majority

11 Heavy Hitters Streaming Majority 11 Heavy Hitters A core mining problem is to find items that occur more than one would expect. These may be called outliers, anomalies, or other terms. Statistical models can be layered on top of or underneath

More information

Linear sketching for Functions over Boolean Hypercube

Tight Bounds for Distributed Functional Monitoring
David P. Woodruff, IBM Almaden (dpwoodru@us.ibm.com) and Qin Zhang, IBM Almaden (qinzhang@cse.ust.hk)
Abstract: We resolve several fundamental questions in the area of …

The Communication Complexity of Correlation. Prahladh Harsha Rahul Jain David McAllester Jaikumar Radhakrishnan

The Communication Complexity of Correlation. Prahladh Harsha Rahul Jain David McAllester Jaikumar Radhakrishnan The Communication Complexity of Correlation Prahladh Harsha Rahul Jain David McAllester Jaikumar Radhakrishnan Transmitting Correlated Variables (X, Y) pair of correlated random variables Transmitting

More information

CPSC 467: Cryptography and Computer Security

CPSC 467: Cryptography and Computer Security CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 14 October 16, 2013 CPSC 467, Lecture 14 1/45 Message Digest / Cryptographic Hash Functions Hash Function Constructions Extending

More information

Lecture 21: Quantum communication complexity

Lecture 21: Quantum communication complexity CPSC 519/619: Quantum Computation John Watrous, University of Calgary Lecture 21: Quantum communication complexity April 6, 2006 In this lecture we will discuss how quantum information can allow for a

More information

Communication vs information complexity, relative discrepancy and other lower bounds

Communication vs information complexity, relative discrepancy and other lower bounds Communication vs information complexity, relative discrepancy and other lower bounds Iordanis Kerenidis CNRS, LIAFA- Univ Paris Diderot 7 Joint work with: L. Fontes, R. Jain, S. Laplante, M. Lauriere,

More information

1 Estimating Frequency Moments in Streams

1 Estimating Frequency Moments in Streams CS 598CSC: Algorithms for Big Data Lecture date: August 28, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri 1 Estimating Frequency Moments in Streams A significant fraction of streaming literature

More information

Communication Lower Bounds for Statistical Estimation Problems via a Distributed Data Processing Inequality

Communication Lower Bounds for Statistical Estimation Problems via a Distributed Data Processing Inequality Communication Lower Bounds for Statistical Estimation Problems via a Distributed Data Processing Inequality Mark Braverman Ankit Garg Tengyu Ma Huy Nguyen David Woodruff DIMACS Workshop on Big Data through

More information

Lecture 2. Frequency problems

Lecture 2. Frequency problems 1 / 43 Lecture 2. Frequency problems Ricard Gavaldà MIRI Seminar on Data Streams, Spring 2015 Contents 2 / 43 1 Frequency problems in data streams 2 Approximating inner product 3 Computing frequency moments

More information

Lecture 1: Perfect Secrecy and Statistical Authentication. 2 Introduction - Historical vs Modern Cryptography

Lecture 1: Perfect Secrecy and Statistical Authentication. 2 Introduction - Historical vs Modern Cryptography CS 7880 Graduate Cryptography September 10, 2015 Lecture 1: Perfect Secrecy and Statistical Authentication Lecturer: Daniel Wichs Scribe: Matthew Dippel 1 Topic Covered Definition of perfect secrecy One-time

More information

The space complexity of approximating the frequency moments

The space complexity of approximating the frequency moments The space complexity of approximating the frequency moments Felix Biermeier November 24, 2015 1 Overview Introduction Approximations of frequency moments lower bounds 2 Frequency moments Problem Estimate

More information

A Continuous Sampling from Distributed Streams

A Continuous Sampling from Distributed Streams A Continuous Sampling from Distributed Streams Graham Cormode, AT&T Labs Research S. Muthukrishnan, Rutgers University Ke Yi, HKUST Qin Zhang, MADALGO, Aarhus University A fundamental problem in data management

More information

Communication Complexity

Communication Complexity Communication Complexity Jaikumar Radhakrishnan School of Technology and Computer Science Tata Institute of Fundamental Research Mumbai 31 May 2012 J 31 May 2012 1 / 13 Plan 1 Examples, the model, the

More information

Sparser Johnson-Lindenstrauss Transforms

Sparser Johnson-Lindenstrauss Transforms Sparser Johnson-Lindenstrauss Transforms Jelani Nelson Princeton February 16, 212 joint work with Daniel Kane (Stanford) Random Projections x R d, d huge store y = Sx, where S is a k d matrix (compression)

More information

Maximal Noise in Interactive Communication over Erasure Channels and Channels with Feedback

Maximal Noise in Interactive Communication over Erasure Channels and Channels with Feedback Maximal Noise in Interactive Communication over Erasure Channels and Channels with Feedback Klim Efremenko UC Berkeley klimefrem@gmail.com Ran Gelles Princeton University rgelles@cs.princeton.edu Bernhard

More information

Fourier analysis of boolean functions in quantum computation

Fourier analysis of boolean functions in quantum computation Fourier analysis of boolean functions in quantum computation Ashley Montanaro Centre for Quantum Information and Foundations, Department of Applied Mathematics and Theoretical Physics, University of Cambridge

More information

Linear Index Codes are Optimal Up To Five Nodes

Linear Index Codes are Optimal Up To Five Nodes Linear Index Codes are Optimal Up To Five Nodes Lawrence Ong The University of Newcastle, Australia March 05 BIRS Workshop: Between Shannon and Hamming: Network Information Theory and Combinatorics Unicast

More information

On Quantum vs. Classical Communication Complexity

On Quantum vs. Classical Communication Complexity On Quantum vs. Classical Communication Complexity Dmitry Gavinsky Institute of Mathematics, Praha Czech Academy of Sciences Introduction Introduction The setting of communication complexity is one of the

More information

Combinatorial Auctions Do Need Modest Interaction

Combinatorial Auctions Do Need Modest Interaction Combinatorial Auctions Do Need Modest Interaction SEPEHR ASSADI, University of Pennsylvania We study the necessity of interaction for obtaining efficient allocations in combinatorial auctions with subadditive

More information

Lecture 11: Quantum Information III - Source Coding

Lecture 11: Quantum Information III - Source Coding CSCI5370 Quantum Computing November 25, 203 Lecture : Quantum Information III - Source Coding Lecturer: Shengyu Zhang Scribe: Hing Yin Tsang. Holevo s bound Suppose Alice has an information source X that

More information

Settling the Query Complexity of Non-Adaptive Junta Testing

Settling the Query Complexity of Non-Adaptive Junta Testing Settling the Query Complexity of Non-Adaptive Junta Testing Erik Waingarten, Columbia University Based on joint work with Xi Chen (Columbia University) Rocco Servedio (Columbia University) Li-Yang Tan

More information

Lecture 6 September 13, 2016

Lecture 6 September 13, 2016 CS 395T: Sublinear Algorithms Fall 206 Prof. Eric Price Lecture 6 September 3, 206 Scribe: Shanshan Wu, Yitao Chen Overview Recap of last lecture. We talked about Johnson-Lindenstrauss (JL) lemma [JL84]

More information

Hardness of MST Construction

Hardness of MST Construction Lecture 7 Hardness of MST Construction In the previous lecture, we saw that an MST can be computed in O( n log n+ D) rounds using messages of size O(log n). Trivially, Ω(D) rounds are required, but what

More information

Grothendieck Inequalities, XOR games, and Communication Complexity

Grothendieck Inequalities, XOR games, and Communication Complexity Grothendieck Inequalities, XOR games, and Communication Complexity Troy Lee Rutgers University Joint work with: Jop Briët, Harry Buhrman, and Thomas Vidick Overview Introduce XOR games, Grothendieck s

More information

Optimal Random Sampling from Distributed Streams Revisited

Optimal Random Sampling from Distributed Streams Revisited Optimal Random Sampling from Distributed Streams Revisited Srikanta Tirthapura 1 and David P. Woodruff 2 1 Dept. of ECE, Iowa State University, Ames, IA, 50011, USA. snt@iastate.edu 2 IBM Almaden Research

More information

New Characterizations in Turnstile Streams with Applications

New Characterizations in Turnstile Streams with Applications New Characterizations in Turnstile Streams with Applications Yuqing Ai 1, Wei Hu 1, Yi Li 2, and David P. Woodruff 3 1 The Institute for Theoretical Computer Science (ITCS), Institute for Interdisciplinary

More information

distributed simulation and distributed inference

distributed simulation and distributed inference distributed simulation and distributed inference Algorithms, Tradeoffs, and a Conjecture Clément Canonne (Stanford University) April 13, 2018 Joint work with Jayadev Acharya (Cornell University) and Himanshu

More information

Sparse Johnson-Lindenstrauss Transforms

Sparse Johnson-Lindenstrauss Transforms Sparse Johnson-Lindenstrauss Transforms Jelani Nelson MIT May 24, 211 joint work with Daniel Kane (Harvard) Metric Johnson-Lindenstrauss lemma Metric JL (MJL) Lemma, 1984 Every set of n points in Euclidean

More information

Simultaneous Communication Protocols with Quantum and Classical Messages

Simultaneous Communication Protocols with Quantum and Classical Messages Simultaneous Communication Protocols with Quantum and Classical Messages Oded Regev Ronald de Wolf July 17, 2008 Abstract We study the simultaneous message passing model of communication complexity, for

More information

PERFECTLY secure key agreement has been studied recently

PERFECTLY secure key agreement has been studied recently IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 2, MARCH 1999 499 Unconditionally Secure Key Agreement the Intrinsic Conditional Information Ueli M. Maurer, Senior Member, IEEE, Stefan Wolf Abstract

More information

Algebraic Problems in Computational Complexity

Algebraic Problems in Computational Complexity Algebraic Problems in Computational Complexity Pranab Sen School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India pranab@tcs.tifr.res.in Guide: Prof. R.

More information

Tracking Distributed Aggregates over Time-based Sliding Windows

Tracking Distributed Aggregates over Time-based Sliding Windows Tracking Distributed Aggregates over Time-based Sliding Windows Graham Cormode and Ke Yi Abstract. The area of distributed monitoring requires tracking the value of a function of distributed data as new

More information

Lecture 5: February 21, 2017

Lecture 5: February 21, 2017 COMS 6998: Advanced Complexity Spring 2017 Lecture 5: February 21, 2017 Lecturer: Rocco Servedio Scribe: Diana Yin 1 Introduction 1.1 Last Time 1. Established the exact correspondence between communication

More information

1 Secure two-party computation

1 Secure two-party computation CSCI 5440: Cryptography Lecture 7 The Chinese University of Hong Kong, Spring 2018 26 and 27 February 2018 In the first half of the course we covered the basic cryptographic primitives that enable secure

More information

1-Pass Relative-Error L p -Sampling with Applications

1-Pass Relative-Error L p -Sampling with Applications -Pass Relative-Error L p -Sampling with Applications Morteza Monemizadeh David P Woodruff Abstract For any p [0, 2], we give a -pass poly(ε log n)-space algorithm which, given a data stream of length m

More information

Communication Complexity (for Algorithm Designers)

Communication Complexity (for Algorithm Designers) arxiv:1509.06257v1 [cs.cc] 21 Sep 2015 Communication Complexity (for Algorithm Designers) Tim Roughgarden c Tim Roughgarden 2015 Preface The best algorithm designers prove both possibility and impossibility

More information

Functional Monitoring without Monotonicity

Functional Monitoring without Monotonicity Functional Monitoring without Monotonicity Chrisil Arackaparambil, Joshua Brody, and Amit Chakrabarti Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA Abstract. The notion of distributed

More information

Range-efficient computation of F 0 over massive data streams

Range-efficient computation of F 0 over massive data streams Range-efficient computation of F 0 over massive data streams A. Pavan Dept. of Computer Science Iowa State University pavan@cs.iastate.edu Srikanta Tirthapura Dept. of Elec. and Computer Engg. Iowa State

More information

Norms, XOR lemmas, and lower bounds for GF(2) polynomials and multiparty protocols

Norms, XOR lemmas, and lower bounds for GF(2) polynomials and multiparty protocols Norms, XOR lemmas, and lower bounds for GF(2) polynomials and multiparty protocols Emanuele Viola, IAS (Work partially done during postdoc at Harvard) Joint work with Avi Wigderson June 2007 Basic questions

More information

A Mergeable Summaries

A Mergeable Summaries A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires

More information

ENEE 457: Computer Systems Security 09/19/16. Lecture 6 Message Authentication Codes and Hash Functions

ENEE 457: Computer Systems Security 09/19/16. Lecture 6 Message Authentication Codes and Hash Functions ENEE 457: Computer Systems Security 09/19/16 Lecture 6 Message Authentication Codes and Hash Functions Charalampos (Babis) Papamanthou Department of Electrical and Computer Engineering University of Maryland,

More information

Cache-Oblivious Hashing

Cache-Oblivious Hashing Cache-Oblivious Hashing Zhewei Wei Hong Kong University of Science & Technology Joint work with Rasmus Pagh, Ke Yi and Qin Zhang Dictionary Problem Store a subset S of the Universe U. Lookup: Does x belong

More information

Lecture 2: Perfect Secrecy and its Limitations

Lecture 2: Perfect Secrecy and its Limitations CS 4501-6501 Topics in Cryptography 26 Jan 2018 Lecture 2: Perfect Secrecy and its Limitations Lecturer: Mohammad Mahmoody Scribe: Mohammad Mahmoody 1 Introduction Last time, we informally defined encryption

More information

A Stable Marriage Requires Communication

A Stable Marriage Requires Communication A Stable Marriage Requires Communication Yannai A. Gonczarowski Noam Nisan Rafail Ostrovsky Will Rosenbaum Abstract The Gale-Shapley algorithm for the Stable Marriage Problem is known to take Θ(n 2 ) steps

More information

4/26/2017. More algorithms for streams: Each element of data stream is a tuple Given a list of keys S Determine which tuples of stream are in S

4/26/2017. More algorithms for streams: Each element of data stream is a tuple Given a list of keys S Determine which tuples of stream are in S Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Emanuele Viola July 6, 2009 Abstract We prove that to store strings x {0, 1} n so that each prefix sum a.k.a. rank query Sumi := k i x k can

More information

Lecture 3: Oct 7, 2014

Lecture 3: Oct 7, 2014 Information and Coding Theory Autumn 04 Lecturer: Shi Li Lecture : Oct 7, 04 Scribe: Mrinalkanti Ghosh In last lecture we have seen an use of entropy to give a tight upper bound in number of triangles

More information

A Mergeable Summaries

A Mergeable Summaries A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires

More information

Lecture 7 Limits on inapproximability

Lecture 7 Limits on inapproximability Tel Aviv University, Fall 004 Lattices in Computer Science Lecture 7 Limits on inapproximability Lecturer: Oded Regev Scribe: Michael Khanevsky Let us recall the promise problem GapCVP γ. DEFINITION 1

More information

Lecture 7: Fingerprinting. David Woodruff Carnegie Mellon University

Lecture 7: Fingerprinting. David Woodruff Carnegie Mellon University Lecture 7: Fingerprinting David Woodruff Carnegie Mellon University How to Pick a Random Prime How to pick a random prime in the range {1, 2,, M}? How to pick a random integer X? Pick a uniformly random

More information

Interval Selection in the streaming model

Interval Selection in the streaming model Interval Selection in the streaming model Pascal Bemmann Abstract In the interval selection problem we are given a set of intervals via a stream and want to nd the maximum set of pairwise independent intervals.

More information

12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters)

12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters) 12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters) Many streaming algorithms use random hashing functions to compress data. They basically randomly map some data items on top of each other.

More information

Leveraging Big Data: Lecture 13

Leveraging Big Data: Lecture 13 Leveraging Big Data: Lecture 13 http://www.cohenwang.com/edith/bigdataclass2013 Instructors: Edith Cohen Amos Fiat Haim Kaplan Tova Milo What are Linear Sketches? Linear Transformations of the input vector

More information

A Tight Lower Bound for High Frequency Moment Estimation with Small Error

A Tight Lower Bound for High Frequency Moment Estimation with Small Error A Tight Lower Bound for High Frequency Moment Estimation with Small Error Yi Li Department of EECS University of Michigan, Ann Arbor leeyi@umich.edu David P. Woodruff IBM Research, Almaden dpwoodru@us.ibm.com

More information

The one-way communication complexity of the Boolean Hidden Matching Problem

The one-way communication complexity of the Boolean Hidden Matching Problem The one-way communication complexity of the Boolean Hidden Matching Problem Iordanis Kerenidis CRS - LRI Université Paris-Sud jkeren@lri.fr Ran Raz Faculty of Mathematics Weizmann Institute ran.raz@weizmann.ac.il

More information

Part 1: Hashing and Its Many Applications

Part 1: Hashing and Its Many Applications 1 Part 1: Hashing and Its Many Applications Sid C-K Chau Chi-Kin.Chau@cl.cam.ac.u http://www.cl.cam.ac.u/~cc25/teaching Why Randomized Algorithms? 2 Randomized Algorithms are algorithms that mae random

More information