The space complexity of approximating the frequency moments

The space complexity of approximating the frequency moments. Felix Biermeier, November 24, 2015.

Overview: introduction; approximations of the frequency moments; lower bounds.

Frequency moments. Problem: estimate $F_k = \sum_{i=1}^{n} m_i^k$ for $k \in \mathbb{N}$ in sublinear space. The $m_i$ summarize a data set: $m_i$ = number of occurrences of item $i$; $F_0$ = number of distinct values; $F_1$ = length of the stream; $F_2$ = repeat rate. (Slide figure: histogram of the counts $m_i$ over the items $1, 2, \dots, n$.)
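For concreteness, a small worked example (my own illustration, not from the slides): for the stream $A = (1, 2, 1, 3, 1, 2)$ over $N = \{1, 2, 3\}$ the counts are $m_1 = 3$, $m_2 = 2$, $m_3 = 1$, so $F_0 = 3$, $F_1 = 3 + 2 + 1 = 6$, $F_2 = 9 + 4 + 1 = 14$, and $F_\infty = \max_i m_i = 3$.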

Case $k \ge 0$. Theorem: There exists a randomized algorithm that, given a sequence $A = (a_1, \dots, a_m)$ of members of $N = \{1, \dots, n\}$, computes in one pass and using $O\!\left(n^{1-1/k}(\log n + \log m)\right)$ memory bits a number $Y$ such that $\Pr\left[\,|Y - F_k| \le \lambda F_k\,\right] \ge 1 - \varepsilon$.

Basic idea: median of means. Define random variables $Y_i$ whose expected value is $F_k$ and whose variance is relatively small; then apply Chebyshev and Chernoff.

Algorithm Estimate $F_k$:
1: for all $i$ do
2:   for all $j$ do
3:     compute $X_{ij}$
4:   $Y_i \leftarrow$ average of all $X_{ij}$
5: output median of all $Y_i$

Proof - Preconditions. Given a sequence $A = (a_1, \dots, a_m)$ with $a_i \in [n]$; random variables $Y_1, \dots, Y_{s_2}$; for each $Y_i$, i.i.d. random variables $X_{i1}, \dots, X_{i s_1}$; each $X_{ij}$ can be computed in $O(\log n + \log m)$ space. (The slide repeats the algorithm Estimate $F_k$ from above.)

Proof - Computation of $X$. Choose an index $p \in \{1, \dots, m\}$ uniformly at random; track $a_p$ in the remainder of $A$ and set $r$ = number of occurrences of $a_p$ among the positions $q \ge p$; define $X = m\left(r^k - (r-1)^k\right)$. (Slide figure: the sequence $A\colon a_1\, a_2 \dots a_p \dots a_{m-1}\, a_m$ with the suffix starting at $a_p$ highlighted.)
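As a concrete illustration of the estimator just described, here is a minimal single-pass sketch in Python (my own sketch, not from the slides; the parameters s1 and s2 and the up-front choice of start positions are assumptions following the AMS construction, and the stream length m is assumed to be known in advance):

import random
import statistics

def estimate_fk(stream, n, k, s1, s2):
    """One-pass median-of-means estimator for F_k (sketch following the slides).

    stream: list of items from {1, ..., n} (length m assumed known up front)
    s1: number of basic estimators X_ij averaged into each Y_i
    s2: number of Y_i whose median is returned
    """
    m = len(stream)
    num = s1 * s2
    # For each basic estimator, pick a random start position p in advance.
    starts = [random.randrange(m) for _ in range(num)]
    tracked = [None] * num   # the item a_p seen at position p
    counts = [0] * num       # r = occurrences of a_p at positions q >= p

    # Single pass over the stream; each element updates every basic estimator.
    for pos, item in enumerate(stream):
        for j in range(num):
            if pos == starts[j]:
                tracked[j] = item
                counts[j] = 1
            elif pos > starts[j] and item == tracked[j]:
                counts[j] += 1

    # X = m * (r^k - (r-1)^k); average groups of s1, then take the median.
    xs = [m * (r**k - (r - 1)**k) for r in counts]
    ys = [sum(xs[i*s1:(i+1)*s1]) / s1 for i in range(s2)]
    return statistics.median(ys)

# Example usage (toy parameters):
# print(estimate_fk([1, 2, 1, 3, 1, 2], n=3, k=2, s1=20, s2=5))

The sketch keeps $s_1 s_2$ counters of $O(\log n + \log m)$ bits each, matching the memory bound of the theorem when $s_1$ and $s_2$ are chosen as in the proof below.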

Proof - Expectation of $X$.
$E[X] = \sum_{i=1}^{n} \sum_{j=1}^{m_i} \frac{1}{m}\, m\left(j^k - (j-1)^k\right)$.
Consider a fixed item $i$; the inner sum telescopes:
$1^k + (2^k - 1^k) + \dots + \left((m_i-1)^k - (m_i-2)^k\right) + \left(m_i^k - (m_i-1)^k\right) = m_i^k$.
Therefore $E[X] = m_1^k + \dots + m_n^k = \sum_{i=1}^{n} m_i^k = F_k$.

Proof - Variance of $X$. Consider the definition $\mathrm{Var}[X] = E[X^2] - E[X]^2$. Similarly to the previous slide, $E[X^2] \le k\, F_1 F_{2k-1}$.
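The bound on $E[X^2]$ can be filled in along the lines of the original AMS argument (my reconstruction; the slide only states the result). Using $j^k - (j-1)^k \le k\, j^{k-1}$,
$E[X^2] = m \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left(j^k - (j-1)^k\right)^2 \le m \sum_{i=1}^{n} \sum_{j=1}^{m_i} k\, j^{k-1}\left(j^k - (j-1)^k\right) \le m \sum_{i=1}^{n} k\, m_i^{k-1} \sum_{j=1}^{m_i} \left(j^k - (j-1)^k\right) = k\, m \sum_{i=1}^{n} m_i^{2k-1} = k\, F_1 F_{2k-1}.$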

Proof - Variance of $Y$. Observation: $E[Y] = E\!\left[\frac{1}{s_1}\sum_{i=1}^{s_1} X_i\right] = \frac{1}{s_1}\sum_{i=1}^{s_1} E[X_i] = F_k = E[X]$. Since the $X_i$ are independent, $\mathrm{Var}[Y] = \mathrm{Var}[X]/s_1 \le k\, F_1 F_{2k-1}/s_1$, and using the fact $F_1 F_{2k-1} \le n^{1-1/k} F_k^2$ we obtain $\mathrm{Var}[Y] \le \dfrac{k\, n^{1-1/k} F_k^2}{s_1}$.

Proof - Probability for a single $Y$. Keep in mind $E[Y] = E[X] = F_k$. Apply Chebyshev's inequality: $\Pr\left[\,|Y - F_k| > \lambda F_k\,\right] \le \dfrac{\mathrm{Var}[Y]}{(\lambda F_k)^2} \le \dfrac{k\, n^{1-1/k}}{\lambda^2 s_1} \le \dfrac{1}{8}$, where the last step uses the choice $s_1 = 8 k\, n^{1-1/k}/\lambda^2$.

Proof - Probability for the median of all $Y_i$. Define the bad event $Z_i = 1$ iff $|Y_i - F_k| > \lambda F_k$, and let $Z = \sum_{i=1}^{s_2} Z_i$. Then $E[Z] = \sum_{i=1}^{s_2} E[Z_i] \le \dfrac{s_2}{8}$. The median of the $Y_i$ can deviate from $F_k$ by more than $\lambda F_k$ only if $Z \ge s_2/2$. Choosing $\delta = 3$ and $\mu = s_2/8$, the Chernoff bound gives $\Pr\left[Z \ge \dfrac{s_2}{2}\right] \le \varepsilon$ for $0 < \varepsilon < 1$, provided $s_2 = \Theta(\log(1/\varepsilon))$.

Case $k = 2$. Theorem: There exists a randomized algorithm that, given a sequence $A = (a_1, \dots, a_m)$ of members of $N = \{1, \dots, n\}$, computes in one pass and using $O(\log n + \log m)$ memory bits a number $Y$ such that $\Pr\left[\,|Y - F_2| \le \lambda F_2\,\right] \ge 1 - \varepsilon$.

Basic idea: similar structure to the proof before; a linear sketch; use four-wise independent random variables $\varepsilon_i$. The estimator is $X = \left(\sum_{i=1}^{n} \varepsilon_i m_i\right)^2$ with $\varepsilon_i \in \{-1, +1\}$. Space complexity before: $O\!\left(\sqrt{n}\,(\log n + \log m)\right)$; now: $O(\log n + \log m)$.
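A minimal sketch of this $F_2$ estimator in Python (my own illustration, not from the slides). For brevity it draws the signs $\varepsilon_i$ with Python's random module instead of a genuinely four-wise independent hash family, which is what the $O(\log n + \log m)$ space bound actually requires:

import random
import statistics

def estimate_f2(stream, n, s1, s2):
    """Tug-of-war style estimator for F_2 = sum_i m_i^2 (sketch).

    Maintains s1*s2 linear sketches Z = sum_i eps_i * m_i and returns the
    median over s2 groups of the averages of Z^2 within each group.
    """
    num = s1 * s2
    # Random +/-1 sign for every (sketch, item) pair.
    # NOTE: a real implementation would use a four-wise independent hash
    # family so that each row needs only O(log n) truly random bits.
    signs = [[random.choice((-1, 1)) for _ in range(n + 1)] for _ in range(num)]
    z = [0] * num
    for item in stream:            # single pass: update every linear sketch
        for j in range(num):
            z[j] += signs[j][item]
    xs = [v * v for v in z]        # X = (sum_i eps_i m_i)^2
    ys = [sum(xs[i*s1:(i+1)*s1]) / s1 for i in range(s2)]
    return statistics.median(ys)

# Example usage (toy parameters):
# print(estimate_f2([1, 2, 1, 3, 1, 2], n=3, s1=20, s2=5))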

Necessity of randomization. Proposition: For any nonnegative integer $k \ne 1$, any deterministic algorithm that outputs, given a sequence $A$ of $n/2$ elements of $N = \{1, \dots, n\}$, a number $Y$ such that $|Y - F_k| \le 0.1 F_k$, must use $\Omega(n)$ memory bits.

Basic idea: take a specific family of subsets of $N$; consider two different input sequences; compare memory configurations; apply the pigeonhole principle. (Slide figure: the memory configurations reached on the inputs $A(G_2, G_1)$ and $A(G_1, G_1)$.)
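Spelled out a bit more (my paraphrase of the AMS counting argument, with the family parameters taken from that paper rather than the slide): fix a family $\mathcal{F}$ of $2^{\Omega(n)}$ subsets of $N$, each of size $n/4$, any two of which intersect in at most $n/8$ elements. If the algorithm uses $o(n)$ memory bits it has fewer than $|\mathcal{F}|$ memory configurations, so by the pigeonhole principle two distinct sets $G_1 \ne G_2$ lead to the same configuration after their elements are streamed in. Appending the elements of $G_1$ to both prefixes then forces the same output on $A(G_1, G_1)$ and $A(G_2, G_1)$, although the true values of $F_k$ on these two sequences differ by more than the allowed $0.1$ relative error, a contradiction.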

$F_\infty$. Definition: $F_\infty = \max_{1 \le i \le n} m_i$. Theorem: Any randomized algorithm that outputs, given a sequence $A$ of at most $2n$ elements of $N = \{1, \dots, n\}$, a number $Y$ such that $\Pr\left[\,|Y - F_\infty| \le F_\infty/3\,\right] \ge 1 - \varepsilon$ for some fixed $\varepsilon < 1/2$, must use $\Omega(n)$ memory bits.

Basic idea. The disjointness problem $\mathrm{DIS}_n(x, y)$: a Boolean function over the ground set $N = \{1, \dots, n\}$; two players hold inputs $x$ and $y$ respectively, $x, y \in \{0,1\}^n$, which characterize subsets $N_x, N_y$ of $N$; the output is $1$ iff $N_x \cap N_y = \emptyset$. Reduce $\mathrm{DIS}_n$ to estimating $F_\infty = \max_{1 \le i \le n} m_i$: a communication lower bound for $\mathrm{DIS}_n$ is known, so define a communication protocol that uses the streaming algorithm to compute $\mathrm{DIS}_n$.
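The reduction, spelled out a little (my paraphrase; the exact error handling follows the AMS proof): the first player streams the elements of $N_x$ into the $F_\infty$ algorithm and sends the resulting memory configuration to the second player, who continues the run on the elements of $N_y$ and reads off the estimate $Y$. If $N_x \cap N_y = \emptyset$ then $F_\infty = 1$, otherwise $F_\infty \ge 2$, so an estimate accurate to within $F_\infty/3$ separates the two cases ($Y$ at most $4/3$ versus at least $4/3$; the boundary value is handled in the formal proof). A streaming algorithm with $o(n)$ memory would thus give an $o(n)$-bit protocol for $\mathrm{DIS}_n$, contradicting its $\Omega(n)$ communication lower bound.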

Case $k > 5$. Theorem: For any fixed $k > 5$ and $\delta < 1/2$, any randomized algorithm that outputs, given an input sequence $A$ of at most $n$ elements of $N = \{1, \dots, n\}$, a number $Z_k$ such that $\Pr\left[\,|Z_k - F_k| \le 0.1 F_k\,\right] \ge 1 - \delta$, uses at least $\Omega(n^{1-5/k})$ memory bits.

Yao's Minimax Principle: the worst-case expected cost of a randomized algorithm is at least the expected cost of the best deterministic algorithm against a suitably chosen distribution of inputs. Here it is used to prove a lower bound for randomized algorithms: show that no deterministic algorithm performs well on inputs drawn from a certain distribution.
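Stated as an inequality (a standard formulation, added here for reference): for every input distribution $\mu$ and every randomized algorithm $R$,
$\max_{x} \mathbb{E}\left[\mathrm{cost}(R, x)\right] \;\ge\; \min_{A \text{ deterministic}} \mathbb{E}_{x \sim \mu}\left[\mathrm{cost}(A, x)\right].$
So to lower-bound the worst-case cost of every randomized algorithm, it suffices to exhibit one distribution $\mu$ on which every deterministic algorithm is expensive.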