CSC2420 Fall 2012: Algorithm Design, Analysis and Theory Lecture 9


1 CSC2420 Fall 2012: Algorithm Design, Analysis and Theory. Lecture 9. Allan Borodin. March 13.

2 Lecture 9

Announcements
1 I have the assignments, which were graded by Lalla.
2 I have now posted five questions for Assignment 2, and the assignment is now complete. The due date is March 27.

Today's agenda
1 Sublinear time and space algorithms.

3 Sublinear time and sublinear space algorithms

We now consider contexts in which randomization is provably more essential. In particular, we will study sublinear time algorithms and then the (small space) streaming model.

An algorithm is sublinear time if its running time is o(n), where n is the length of the input. Such an algorithm must provide an answer without reading the entire input. Thus, to achieve non-trivial tasks, we almost always have to use randomness in sublinear time algorithms to sample parts of the input and/or to randomly keep snapshots of what we have seen.

The subjects of sublinear time and sublinear space algorithms are extensive, and we will only present a very small selection. The general flavour of sublinear time results will be a tradeoff between the accuracy of the solution and the time bound. There is some relation between this topic and distributed local algorithms. Sublinear space algorithms (e.g., streaming algorithms) have some relation to online algorithms. These topics will take us beyond search and optimization problems.

4 A deterministic exception: estimating the diameter in a finite metric space

We first consider an exception: a sublinear time algorithm that does not use randomization. (Comment: sublinear in a weak sense.)

Suppose we are given a finite metric space M (with say n points x_i) where the input is given as n² distance values d(x_i, x_j). The problem is to compute the diameter D of the metric space, that is, the maximum distance between any two points.

For this maximum diameter problem, there is a simple O(n) time (and hence sublinear in n², the number of distances) algorithm; namely, choose an arbitrary point x ∈ M and compute D' = max_j d(x, x_j). By the triangle inequality, D' is a 2-approximation of the diameter D.

I say sublinear time in a weak sense because for an implicitly represented distance function (such as d-dimensional Euclidean space), the points could be explicitly given as inputs, and then the input size is n and not n².
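The following Python sketch (my own rendering, not from the lecture; the function and variable names are assumptions) illustrates the 2-approximation: it queries only the n − 1 distances from one fixed point.

    def approx_diameter(points, dist):
        # 2-approximation of the diameter of a finite metric space.
        # Reads only the n-1 distances from a single point x, out of the
        # ~n^2 pairwise distances. By the triangle inequality the true
        # diameter D satisfies D' <= D <= 2*D'.
        x = points[0]                      # any fixed point works; no randomness needed
        return max(dist(x, y) for y in points[1:])

    # toy usage: points on a line with d(a, b) = |a - b|
    pts = [0.0, 1.0, 4.0, 9.0]
    print(approx_diameter(pts, lambda a, b: abs(a - b)))   # 9.0 (true diameter is 9.0)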

5 Sampling the inputs: some examples

The goal in this area is to minimize execution time while still being able to produce a reasonable answer with sufficiently high probability. Recall that by independent trials, we can reduce the probability of error. We will consider the following examples:
1 Finding an element in a sorted (doubly) linked list [Chazelle, Liu, Magen]
2 Estimating the average degree in a graph [Feige 2006]
3 Estimating the size of some maximal (and maximum) matching in bounded degree graphs [Nguyen and Onak 2008]
4 Examples of property testing, a major topic within the area of sublinear time algorithms. See Dana Ron's DBLP for many results and surveys.
5 In many cases, the algorithms will be simple or reasonably natural, but the analysis might be quite non-trivial.

6 Finding an element in a sorted list of distinct elements

Suppose we have an array A[i] for 1 ≤ i ≤ n where each A[i] is a triple (x_i, p_i, s_i) and the {p_i, s_i} constitute a doubly linked list. That is, p_i = argmax{j : x_j < x_i} if such an x_j exists, and similarly s_i = argmin{j : x_j > x_i}. We would like to determine if a given value x occurs in the doubly linked list and, if so, output the index j such that x = x_j.

A √n algorithm for searching in a sorted linked list
Let R = {j_i : 1 ≤ i ≤ √n} be √n randomly chosen indices. Access these {A[j_i]} to determine the predecessor and successor of x amongst these randomly chosen elements of the list. (There may not be both a predecessor and a successor.) Then (alternately) do a brute force linked search (or, resp., search for √n steps) in both directions of the linked list to determine whether or not x occurs in the list.
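A Python sketch of the zero-error variant (my code; the array layout A[i] = (value, pred_index, succ_index) with None at the ends is an assumed encoding of the triples above): sample about √n cells, then walk the pointers from the best sampled predecessor.

    import math, random

    def search_sorted_linked_list(A, x):
        # A[i] = (value, pred_index, succ_index); pred/succ are None at the ends.
        # Zero-error variant: sample ~sqrt(n) cells, then walk the pointers from
        # the best sampled predecessor; expected O(sqrt(n)) steps.
        n = len(A)
        if n == 0:
            return None
        sample = random.sample(range(n), max(1, math.isqrt(n)))
        below = [j for j in sample if A[j][0] <= x]
        if below:
            j = max(below, key=lambda j: A[j][0])
            while j is not None and A[j][0] < x:
                j = A[j][2]                      # follow successor pointers
        else:
            j = min(sample, key=lambda j: A[j][0])
            while j is not None and A[j][0] > x:
                j = A[j][1]                      # follow predecessor pointers
        return j if j is not None and A[j][0] == x else None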

7 Finding an element in a sorted list (continued)

Claim: This is a zero-sided error algorithm that runs in expected time O(√n) (resp. a one-sided error algorithm that has constant probability of not finding x if it exists). Using the Yao principle, this expected time can be shown to be optimal.

The same can be done for a singly linked list if the list is "anchored"; i.e., we have the index of the smallest element in the list.

Similar results were shown by Chazelle, Liu and Magen for various geometric problems, such as determining whether or not two convex polygons (represented by doubly linked lists of the vertices) intersect.

Note that most sublinear time algorithms are either randomized 1-sided or 2-sided error algorithms, and not 0-sided algorithms that always compute a correct answer but whose running time is bounded in expectation.

8 Estimating average degree in a graph

Given a graph G = (V, E) with |V| = n, we want to estimate the average degree d̄ of the vertices. We can assume that G is connected and hence there are at least n − 1 edges. We want to construct an algorithm that approximates the average degree within a factor less than (2 + ɛ) with probability at least 3/4, in time O(√n · poly(1/ɛ)). We will assume that we can access the degree d_i of any vertex v_i in one step.

Again, we note that like a number of results in this area, the algorithm is simple but the analysis requires some care.

The (simplified) Feige algorithm (as presented in the Czumaj and Sohler survey)
Sample 8/ɛ random subsets S_i of V, each of size (say) √n/ɛ³.
Compute the average degree a_i of the nodes in each S_i.
The output is the minimum of these {a_i}.
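A Python sketch of this procedure (my code; `degree` is an assumed degree-query oracle on vertices 0, ..., n−1, and the constants follow the slide):

    import math, random

    def estimate_avg_degree(degree, n, eps=0.5):
        # Simplified Feige-style estimator: take 8/eps samples of size
        # ~sqrt(n)/eps^3 and return the minimum of the sample averages.
        # With probability >= 3/4 this is within a (2+eps) factor of the
        # true average degree of a connected graph.
        trials = math.ceil(8 / eps)
        size = max(1, min(n, math.ceil(math.sqrt(n) / eps ** 3)))
        estimates = []
        for _ in range(trials):
            S = random.sample(range(n), size)
            estimates.append(sum(degree(v) for v in S) / size)
        return min(estimates)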

9 The analysis of the approximation

Since we are sampling subsets to estimate the average degree, we might have estimates that are too low or too high. But it can be shown that with high probability these estimates will not be too bad. More precisely, we need:
1 Lemma 1: Prob[a_i < (1/2)(1 − ɛ) d̄] ≤ ɛ/64
2 Lemma 2: Prob[a_i > (1 + ɛ) d̄] ≤ 1 − ɛ/2

The probability bound in Lemma 2 follows from the Markov inequality, which is then amplified as usual by the 8/ɛ repeated trials, so that the probability that all of the a_i are bigger than (1 + ɛ) d̄ is at most (1 − ɛ/2)^(8/ɛ) = (1 − 1/t)^(4t) ≤ (1/e)^4, letting t = 2/ɛ.

10 The analysis of the average degree (continued)

From Lemma 1, we fall outside the desired bound if any of the repeated trials gives a very small estimate of the average degree, but by the union bound this is no worse than the sum of the probabilities for each trial.

It remains to sketch a proof of Lemma 1. Let H be the set of the √(ɛn) highest degree vertices in V and let L = V \ H. Then Σ_{v ∈ L} d_v ≥ (1/2 − ɛ) Σ_{v ∈ V} d_v, since there can be at most ɛn edges within H, every edge adjacent to L contributes at least 1 to the sum of the degrees of vertices in L, and there are at least n − 1 edges. The 1/2 is because we are possibly double counting the contribution of edges within L.

For a lower bound on the average degree of vertices in a sample set S, the worst case is if all the sampled vertices are in L.

11 Conclusion of proof sketch for Lemma 1

Let X_j be the random variable corresponding to the degree of the j-th sampled vertex in a sampled set S, where each such S has size s. By Hoeffding's generalization of the Chernoff bound, we have that

Prob[(1/s) Σ_{j=1}^{s} X_j < (1 − ɛ)(1/|L|) Σ_{v ∈ L} deg(v)]

is exponentially small; that is, the probability that the average degree in a sampled set is less than (1 − ɛ) times the average degree in L is exponentially small. But the average degree of a vertex in L is at least (1/2 − ɛ) times the average degree in the graph, so the probability that the sample average is less than (1 − ɛ)(1/2 − ɛ) times the average degree in the graph is exponentially small.

Feige's more detailed analysis shows that a (2 + ɛ) approximation can be obtained using time (i.e., queries) O(√(n/d_0)/ɛ) for graphs with average degree at least d_0.

13 Understanding the input query model

As we initially noted, sublinear time algorithms almost invariably sample (i.e., query) the input in some way. The nature of these queries will clearly influence what kinds of results can be obtained.

Feige's [2006] algorithm for estimating the average degree uses only "degree queries"; that is, what is the degree of a vertex v. Feige shows that in this degree query model, any (randomized) algorithm that achieves a (2 − ɛ) approximation requires Ω(n) queries.

In contrast, Goldreich and Ron [2008] consider the same average degree problem in the neighbour query model; that is, upon a query (v, j), the query oracle returns the j-th neighbour of v, or a special symbol indicating that v has degree less than j. A degree query can be simulated by log n neighbour queries.

Goldreich and Ron show that in the neighbour query model, the average degree d̄ can be (1 + ɛ) approximated (with one-sided error probability 2/3) in time O(√n · poly(log n, 1/ɛ)). They also show that Ω(√(n/ɛ)) queries are necessary to achieve a (1 + ɛ) approximation in this neighbour query model.

14 Approximating the size of a maximum matching in a bounded degree graph

We recall that the size of any maximal matching is within a factor of 2 of the size of a maximum matching. Let m denote the size of a smallest possible maximal matching. Our goal is to estimate, with high probability, the size of a maximal matching in time depending only on the maximum degree D (and on ɛ).

Nguyen and Onak Algorithm
Choose a random permutation p of the edges {e_j}. % Note: this will be done "on the fly" as needed
The permutation determines a maximal matching M, namely the one given by the greedy algorithm that considers the edges in the order of p and adds an edge whenever possible.
Choose r = O(D/ɛ²) vertices {v_i} at random.
Using an oracle, let X_i be the indicator random variable for whether or not vertex v_i is matched in M.
Output the estimate m̂ obtained by scaling Σ_{i=1,...,r} X_i (the fraction of matched sampled vertices estimates 2|M|/n).
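Below is a Python sketch of the local greedy-matching oracle and the vertex-sampling estimator (my own rendering; the adjacency-list format, the memoization, and the final scaling by n/(2r) are my assumptions, not the lecture's exact pseudocode).

    import random

    def estimate_matching_size(adj, eps=0.2, seed=0):
        # adj: dict mapping each vertex (comparable labels, e.g. ints) to a list
        # of neighbours. Estimates |M| for the greedy maximal matching M induced
        # by a random edge ordering, via r = O(D/eps^2) sampled vertices.
        rng = random.Random(seed)
        n = len(adj)
        D = max((len(nb) for nb in adj.values()), default=0)
        rank, memo = {}, {}

        def edge_rank(u, v):
            e = (u, v) if u < v else (v, u)
            if e not in rank:
                rank[e] = rng.random()       # lazily realized random permutation
            return rank[e]

        def in_matching(u, v):
            # (u,v) is in the greedy matching iff no adjacent edge of smaller
            # rank is in the matching; recursion is on strictly smaller ranks.
            e = (u, v) if u < v else (v, u)
            if e in memo:
                return memo[e]
            r = edge_rank(u, v)
            ans = True
            for w in (u, v):
                for z in adj[w]:
                    if {w, z} != {u, v} and edge_rank(w, z) < r and in_matching(w, z):
                        ans = False
                        break
                if not ans:
                    break
            memo[e] = ans
            return ans

        r = max(1, int(D / eps ** 2))
        verts = list(adj)
        matched = 0
        for _ in range(r):
            v = rng.choice(verts)
            if any(in_matching(v, w) for w in adj[v]):
                matched += 1                 # X_i = 1: sampled vertex is matched
        return matched / r * n / 2           # each matching edge covers 2 vertices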

15 Performance and time for the maximal matching

Claims
1 m ≤ m̂ ≤ m + ɛn, where m = |M|.
2 The algorithm runs in time 2^{O(D)}/ɛ².

This immediately gives an approximation m̂ of the maximum matching size m* such that m* ≤ m̂ ≤ 2m* + ɛn.

A more involved algorithm by Nguyen and Onak yields the following result:

Nguyen and Onak maximum matching result
Let δ, ɛ > 0 and let k = 1/δ. There is a randomized one-sided error algorithm (with success probability 2/3) running in time 2^{O(D^k)}/ɛ^{2k+1} that outputs a maximum matching estimate m̂ such that m* ≤ m̂ ≤ (1 + δ)m* + ɛn.

16 Property Testing

Perhaps the most prevalent and useful aspect of sublinear time algorithms is the concept of property testing. This is its own area of research with many results. Here is the concept:

Given an object G (e.g., a function, a graph), test whether or not G has some property P (e.g., G is bipartite), or is in some sense far away from that property. The tester determines with sufficiently high probability (say 2/3) whether G has the property or is ɛ-far from having the property. The tester can answer either way if G does not have the property but is ɛ-close to having the property.

We will usually have a 1-sided error in that we will always answer YES if G has the property. We will see what it means to be ɛ-far from (or ɛ-close to) a property by some examples. See also question 5 in Assignment 2.

17 Tester for linearity of a function

Let f : Z_n → Z_n; f is linear if f(x + y) = f(x) + f(y) for all x, y. Note: this will really be a test for being a group homomorphism.

f is said to be ɛ-close to linear if its values can be changed on at most a fraction ɛ of the function's domain (i.e., on at most ɛn elements of Z_n) so as to make it a linear function. Otherwise f is said to be ɛ-far from linear.

The tester
Repeat 4/ɛ times
  Choose x, y ∈ Z_n at random
  If f(x) + f(y) ≠ f(x + y) then output "f is not linear"
End Repeat
If all these 4/ɛ tests succeed then output "linear"

Clearly if f is linear, the tester says linear. For ɛ < 2/9, if f is ɛ-far from being linear, then the probability of detecting this is at least 2/3.
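A Python sketch of this tester (my code; f is assumed to be given as a query oracle on Z_n, and arithmetic is taken mod n):

    import math, random

    def linearity_tester(f, n, eps=0.1, seed=None):
        # Repeat 4/eps times: pick random x, y in Z_n and check
        # f(x) + f(y) == f(x + y) (mod n). Always accepts a linear
        # (group homomorphism) f; rejects eps-far functions w.h.p.
        rng = random.Random(seed)
        for _ in range(math.ceil(4 / eps)):
            x, y = rng.randrange(n), rng.randrange(n)
            if (f(x) + f(y)) % n != f((x + y) % n):
                return False          # "f is not linear"
        return True                   # "linear" (or eps-close to it)

    # toy usage: f(x) = 3x mod 17 is linear
    print(linearity_tester(lambda x: (3 * x) % 17, 17))   # True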

18 Testing a list for monotonicity

Given a list A[i] = x_i, i = 1, ..., n of distinct elements, determine if A is a monotone list (i.e., i < j implies A[i] < A[j]) or is ɛ-far from being monotone, in the sense that more than ɛn list values need to be changed in order for A to be monotone.

The algorithm randomly chooses 2/ɛ random indices i and performs a binary search for x_i to determine whether the binary search successfully locates x_i. The algorithm reports that the list is monotone if and only if all binary searches succeed.

Clearly the time bound is O(log n/ɛ), and clearly if A is monotone then the tester reports monotone. If A is ɛ-far from monotone, then the probability that a random binary search will succeed is at most (1 − ɛ), and hence the probability of the algorithm failing to detect non-monotonicity is at most (1 − ɛ)^{2/ɛ} ≤ 1/e².
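A Python sketch of the spot-checker (my code; a binary search "succeeds" here only if it terminates exactly at the sampled index):

    import math, random

    def monotone_tester(A, eps=0.1, seed=None):
        # Pick ~2/eps random indices i and binary-search for the value A[i];
        # accept iff every search ends exactly at position i.
        # Accepts every monotone list of distinct values; rejects eps-far
        # lists with constant probability. Time O(log(n)/eps).
        rng = random.Random(seed)
        n = len(A)
        if n == 0:
            return True
        for _ in range(math.ceil(2 / eps)):
            i = rng.randrange(n)
            lo, hi, found = 0, n - 1, False
            while lo <= hi:
                mid = (lo + hi) // 2
                if A[mid] == A[i]:
                    found = (mid == i)
                    break
                elif A[mid] < A[i]:
                    lo = mid + 1
                else:
                    hi = mid - 1
            if not found:
                return False          # evidence of non-monotonicity
        return True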

19 Graph Property testing

Graph property testing is an area by itself. There are several models for testing graph properties. Let G = (V, E) with n = |V| and m = |E|.

Dense model: Graphs represented by an adjacency matrix. Say that the graph is ɛ-far from having a property P if more than ɛn² matrix entries have to be changed so that the graph has property P.

Sparse model, bounded degree model: Graphs represented by vertex adjacency lists. The graph is ɛ-far from property P if at least ɛm edges have to be changed.

In general there are substantially different results for these two graph models.

20 The property of being bipartite

In the dense model, there is a constant time one-sided error tester. The tester is (once again) conceptually what one might expect, but the analysis is not at all immediate.

Goldreich, Goldwasser, Ron bipartite tester
Pick a random subset S of vertices of size r = Θ(log(1/ɛ)/ɛ²).
Output "bipartite" iff the induced subgraph on S is bipartite.

Clearly if G is bipartite then the algorithm will always say that it is bipartite. The claim is that if G is ɛ-far from being bipartite, then the algorithm will say that it is not bipartite with probability at least 2/3. The algorithm runs in time proportional to the size of the induced subgraph (i.e., the time needed to create the induced subgraph).
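A Python sketch of the tester (my code; the adjacency-matrix format and the hidden constant in r are assumptions), using BFS 2-colouring to check the induced subgraph:

    import math, random
    from collections import deque

    def bipartite_tester_dense(adj_matrix, eps=0.1, seed=None):
        # Sample r = Theta(log(1/eps)/eps^2) vertices and accept iff the
        # induced subgraph is 2-colourable. One-sided error: a bipartite
        # graph is always accepted.
        rng = random.Random(seed)
        n = len(adj_matrix)
        r = min(n, math.ceil(math.log(1 / eps) / eps ** 2))
        S = rng.sample(range(n), r)
        colour = {}
        for s in S:
            if s in colour:
                continue
            colour[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in S:
                    if adj_matrix[u][v]:
                        if v not in colour:
                            colour[v] = 1 - colour[u]
                            queue.append(v)
                        elif colour[v] == colour[u]:
                            return False    # odd cycle in the induced subgraph
        return True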

21 Testing bipartiteness in the bounded degree model

Even for degree 3 graphs, Ω(√n) queries are required to test whether a graph is bipartite or ɛ-far from being bipartite.

Goldreich and Ron [1997]: There is an algorithm that uses O(√n · poly(log n/ɛ)) queries. The algorithm is based on random walks in the graph and utilizes the fact that a graph is bipartite iff it has no odd length cycles.

Goldreich and Ron [1999] bounded degree algorithm
Repeat O(1/ɛ) times
  Randomly select a vertex s ∈ V
  If algorithm OddCycle(s) returns "cycle found" then REJECT
End Repeat
If the algorithm did not already reject, then ACCEPT

OddCycle performs poly(log n/ɛ) random walks from s, each of length poly(log n/ɛ). If some vertex v is reached by both an even length and an odd length prefix of a walk, then report "cycle found"; else report "odd cycle not found".
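A Python sketch of OddCycle and the outer loop (my code; the adjacency list format and the particular polylog walk parameters are placeholders chosen only to make the sketch runnable):

    import math, random

    def odd_cycle(adj, s, walks, length, rng):
        # Run `walks` random walks of `length` steps from s; report True
        # ("cycle found") if some vertex is reached at both an even and an
        # odd distance along some walk prefix (across all walks from s).
        parity_seen = {}
        for _ in range(walks):
            v, steps = s, 0
            parity_seen.setdefault(v, set()).add(0)
            for _ in range(length):
                if not adj[v]:
                    break
                v = rng.choice(adj[v])
                steps += 1
                seen = parity_seen.setdefault(v, set())
                seen.add(steps % 2)
                if len(seen) == 2:
                    return True
        return False

    def bipartite_tester_bounded_degree(adj, eps=0.1, seed=None):
        # adj: list of neighbour lists, indexed by vertex 0..n-1.
        # One-sided: a bipartite graph is always accepted, since every walk
        # from s reaches a given vertex with a single parity.
        rng = random.Random(seed)
        n = len(adj)
        walks = length = max(1, int(math.log(max(n, 2)) ** 2 / eps))
        for _ in range(math.ceil(1 / eps)):
            s = rng.randrange(n)
            if odd_cycle(adj, s, walks, length, rng):
                return False                  # REJECT
        return True                           # ACCEPT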

22 Sublinear space: A slight detour into complexity theory

Sublinear space has been an important topic in complexity theory since the start of complexity theory. While not as important as the P vs NP or NP vs co-NP question, there are two fundamental space questions that remain unresolved:
1 Is NSPACE(S) = DSPACE(S) for S ≥ log n?
2 Is P contained in DSPACE(log n), or in ∪_k DSPACE(log^k n)? Equivalently, is P contained in polylogarithmic parallel time?

Savitch [1969] showed that a nondeterministic S space bounded TM can be simulated by a deterministic S² space bounded machine (for space constructible bounds S). Further, in what was considered a very surprising result, Immerman [1987] and independently Szelepcsényi [1987] showed that NSPACE(S) = co-NSPACE(S). (Savitch's result was also considered surprising by some researchers when it was announced.)

23 USTCON vs STCON

We let USTCON (resp. STCON) denote the problem of deciding if there is a path from some specified source node s to some specified target node t in an undirected (resp. directed) graph G.

As previously mentioned, the Aleliunas et al [1979] random walk result showed that USTCON is in RSPACE(log n), and after a sequence of partial results about USTCON, Reingold [2008] was eventually able to show that USTCON is in DSPACE(log n).

It remains open whether
1 STCON (and hence NSPACE(log n)) is in RSPACE(log n) or even DSPACE(log n).
2 STCON ∈ RSPACE(S), or even DSPACE(S), for any S = o(log² n).
3 RSPACE(S) = DSPACE(S).

24 The streaming model

In the data stream model, the input is a sequence A of inputs a_1, ..., a_m where, say, each a_i ∈ {1, 2, ..., n}; the stream is assumed to be too large to store in memory. We usually assume that m is not known, and hence one can think of this model as a type of online or dynamic algorithm that is maintaining (say) current statistics.

The space available S(m, n) is some sublinear function. The input streams by and one can only store information in space S. In some papers, space is measured in bits (which is what we will usually do) and sometimes in words, each word being O(log n) bits.

It is also desirable that each input is processed efficiently, say in time O(log(m + n)) and perhaps even in time O(1) (assuming we are counting operations on words as O(1)).

25 The streaming model continued

The initial (and primary) work in streaming algorithms is to approximately compute some function (say a statistic) of the data, or to identify some particular element(s) of the data stream.

Lately, the model has been extended to consider semi-streaming algorithms for optimization problems. For example, for a graph problem such as matching on a graph G = (V, E), the goal is to obtain a good approximation using space Õ(|V|) rather than O(|E|).

Most results concern the space required for a one pass algorithm, but there are other results concerning the tradeoff between the space and the number of passes.

26 An example of a deterministic streaming algorithm

As in sublinear time, it will turn out that almost all of the results in this area are for randomized algorithms. Here is one exception.

The missing element problem
Suppose we are given a stream A = a_1, ..., a_{n−1} and we are promised that the stream A is a permutation of {1, ..., n} \ {x} for some integer x in [1, n]. The goal is to compute the missing x.

Space n bits is obvious, using a bit vector with c_j = 1 iff j has occurred. Instead, we know that Σ_{a ∈ A} a = n(n + 1)/2 − x. So if s = Σ_i a_i, then x = n(n + 1)/2 − s. This uses only 2 log n bits of space and constant time per item.
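A one-pass Python sketch (my code) of this sum trick:

    def missing_element(stream, n):
        # Keep only the running sum (O(log n) bits) and subtract it
        # from n(n+1)/2 to recover the missing element of {1,...,n}.
        s = 0
        for a in stream:          # one pass, O(1) work per item
            s += a
        return n * (n + 1) // 2 - s

    print(missing_element([5, 1, 4, 2], 5))   # -> 3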

27 Generalizing to k missing elements

Now suppose we are promised a stream A of length n − k whose elements consist of a permutation of n − k distinct elements of {1, ..., n}. We want to find the k missing elements.

Generalizing the one missing element solution to the case of k missing elements, we can (for example) maintain the sums of j-th powers (1 ≤ j ≤ k): s_j = Σ_{a_i ∈ A} (a_i)^j = c_j(n) − Σ_{x ∉ A} x^j, where c_j(n) is the closed form expression for Σ_{i=1}^{n} i^j. This results in k equations in the k unknowns, using space O(k² log n), but without an efficient way to compute the solution. As far as I know, there may not be an efficient small space deterministic streaming algorithm for this problem.

Using randomization, much more efficient methods are known; namely, there is a streaming algorithm with space and time per item O(k log k log n), and it can be shown that Ω(k log(n/k)) space is necessary.
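A Python sketch (my code) of maintaining the k power sums; it returns the right-hand sides of the k equations and leaves the (expensive) solving step aside:

    def power_sum_sketch(stream, n, k):
        # Maintain s_j = sum of a^j over the stream for j = 1..k.
        # The missing elements x_1,...,x_k then satisfy
        #   sum_i (x_i)^j = c_j(n) - s_j,  where c_j(n) = sum_{i=1}^n i^j.
        s = [0] * (k + 1)
        for a in stream:
            for j in range(1, k + 1):
                s[j] += a ** j
        c = [sum(i ** j for i in range(1, n + 1)) for j in range(k + 1)]
        return [c[j] - s[j] for j in range(1, k + 1)]

    # e.g. n = 6, missing {2, 5}: returns [2+5, 4+25] = [7, 29]
    print(power_sum_sketch([1, 3, 4, 6], 6, 2))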

28 Some well-studied streaming problems

Computing frequency moments. Let A = a_1 ... a_m be a data stream with a_i ∈ [n] = {1, 2, ..., n}. Let m_i denote the number of occurrences of the value i in the stream A. For k ≥ 0, the k-th frequency moment is F_k = Σ_{i ∈ [n]} (m_i)^k. The frequency moments are most often studied for integral k.
1 F_1 = m, the length of the sequence, which can be computed exactly with a simple counter.
2 F_0 is the number of distinct elements in the stream.
3 F_2 is a special case of interest called the repeat index (also known as Gini's homogeneity index).

Finding k-heavy hitters; i.e., those elements appearing at least m/k times in stream A.

Finding rare or unique elements in A.
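For concreteness, an exact (non-streaming) Python reference for F_k (my code), useful for checking the streaming estimators discussed next:

    from collections import Counter

    def frequency_moment(stream, k):
        # Exact reference: F_k = sum_i m_i^k, m_i = multiplicity of value i.
        counts = Counter(stream)
        if k == 0:
            return len(counts)               # number of distinct elements
        return sum(m ** k for m in counts.values())

    A = [3, 1, 3, 2, 3, 1]
    print(frequency_moment(A, 0), frequency_moment(A, 1), frequency_moment(A, 2))  # 3 6 14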

29 What is known about computing F_k?

Given an error bound ɛ and a confidence bound δ, the goal in the frequency moment problem is to compute an estimate F̂_k such that Prob[|F̂_k − F_k| > ɛF_k] ≤ δ.

The seminal paper in this regard is by Alon, Matias and Szegedy (AMS) [1999]. AMS establish a number of results:
1 For k ≥ 3, there is an Õ(m^{1−1/k}) space algorithm. The Õ notation hides factors that are polynomial in 1/ɛ and polylogarithmic in m, n, 1/δ.
2 For k = 0 and every c > 2, there is an O(log n) space algorithm computing F̂_0 such that Prob[(1/c)F_0 ≤ F̂_0 ≤ cF_0 does not hold] ≤ 2/c.
3 For k = 1, log m bits obviously suffice to compute the length exactly, but an estimate can be obtained with space O(log log m + 1/ɛ).
4 For k = 2, they obtain space Õ(1) = O((log(1/δ)/ɛ²)(log n + log m)).
5 They also show that for all k > 5, there is a (space) lower bound of Ω(m^{1−5/k}).

30 Results following AMS

A considerable line of research followed this seminal paper, notably settling conjectures in AMS. The following results apply to real as well as integral k:
1 An Ω(m^{1−2/k}) space lower bound for all k > 2 (Bar-Yossef et al [2002]).
2 Indyk and Woodruff [2005] settle the space bound for k > 2 with a matching upper bound of Õ(m^{1−2/k}).

The basic idea behind these randomized approximation algorithms is to define a random variable Y whose expected value is close to F_k and whose variance is sufficiently small, such that this r.v. can be calculated under the space constraint. We will just sketch the (non-optimal) AMS results for F_k for k > 2 and the result for F_2.

31 The AMS F_k algorithm

Let s_1 = (8 m^{1−1/k})/(ɛ²δ²) and s_2 = 2 log(1/δ).

AMS algorithm for F_k
The output Y of the algorithm is the median of s_2 random variables Y_1, Y_2, ..., Y_{s_2}, where Y_i is the mean of s_1 random variables X_{ij}, 1 ≤ j ≤ s_1. All X_{ij} are independent, identically distributed random variables, each calculated in the same way as follows: choose a random position p ∈ {1, ..., m} and observe the value a_p. Maintain r = |{q : q ≥ p and a_q = a_p}|. Define X = m(r^k − (r − 1)^k).

Note that in order to calculate X, we only require storing a_p (i.e., log n bits) and r (i.e., at most log m bits). Hence each X = X_{ij} is calculated using only O(log n + log m) bits. For simplicity we assume the input stream length m is known, but it can be estimated and updated as the stream unfolds.

We need to show that E[X] = F_k and that the variance Var[X] is small enough so that the Chebyshev inequality shows that Prob[|Y_i − F_k| > ɛF_k] is small.
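A one-pass Python sketch of this estimator (my code; each basic estimator picks its position p by reservoir sampling so that m need not be known in advance, and s_1, s_2 are small illustrative values rather than the ones from the analysis):

    import random, statistics

    def ams_fk_estimate(stream, k, s1=100, s2=5, seed=None):
        # Each basic estimator keeps a uniformly random stream position p
        # (via reservoir sampling), counts r = #{q >= p : a_q = a_p}, and
        # outputs X = m * (r^k - (r-1)^k). The final output is the median
        # of s2 means of s1 independent copies of X.
        rng = random.Random(seed)
        num = s1 * s2
        vals = [None] * num        # sampled value a_p for each basic estimator
        r = [0] * num              # occurrences of a_p at or after position p
        m = 0
        for a in stream:
            m += 1
            for j in range(num):
                if rng.random() < 1.0 / m:    # keep the current item w.p. 1/m
                    vals[j], r[j] = a, 1
                elif a == vals[j]:
                    r[j] += 1
        X = [m * (rj ** k - (rj - 1) ** k) for rj in r]
        means = [sum(X[i * s1:(i + 1) * s1]) / s1 for i in range(s2)]
        return statistics.median(means)

    A = [random.randrange(10) for _ in range(10000)]
    print(ams_fk_estimate(A, 2, seed=1))      # compare against the exact F_2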

32 AMS analysis sketch

Showing E[X] = F_k (by telescoping). Averaging over the random position p, the m_i positions holding value i contribute the r-values 1, ..., m_i, and each inner sum telescopes to m_i^k:

E[X] = (m/m) [ (1^k + (2^k − 1^k) + ... + (m_1^k − (m_1 − 1)^k))
             + (1^k + (2^k − 1^k) + ... + (m_2^k − (m_2 − 1)^k))
             + ...
             + (1^k + (2^k − 1^k) + ... + (m_n^k − (m_n − 1)^k)) ]
     = Σ_{i=1}^{n} m_i^k = F_k

33 AMS analysis continued

Y is the median of the Y_i. It is a standard probabilistic idea that taking the median Y of identical r.v.s Y_i (each having constant probability of small deviation from their mean F_k) implies that Y has a high probability of having a small deviation from this mean.

E[Y_i] = E[X] and Var[Y_i] = Var[X]/s_1 ≤ E[X²]/s_1. The result needed is that Prob[|Y_i − F_k| > ɛF_k] ≤ 1/8.

The Y_i values are averages of independent X = X_{ij} variables, but these can take on large values, so instead of Chernoff bounds AMS use the Chebyshev inequality:

Prob[|Y − E[Y]| > ɛ E[Y]] ≤ Var[Y]/(ɛ² E[Y]²)

It remains to show that E[X²] ≤ k F_1 F_{2k−1} and that F_1 F_{2k−1} ≤ n^{1−1/k} F_k².

34 Sketch of the F_2 improvement

They again take the median of s_2 = 2 log(1/δ) random variables Y_i, but now each Y_i will be the average of only s_1 = 16/ɛ² (a constant, for fixed ɛ) identically distributed variables X = X_{ij}.

The key additional idea is that X will not maintain a count for each particular value separately, but rather will maintain an appropriate signed sum Z = Σ_{t=1}^{n} b_t m_t and set X = Z². Here is how the sign vector ⟨b_1, ..., b_n⟩ ∈ {−1, 1}^n is randomly chosen. Let V = {v_1, ..., v_h} be a set of h = O(n²) vectors over {−1, 1}, where each v_p = ⟨v_{p,1}, ..., v_{p,n}⟩ has length n and the family is 4-wise independent. Then p is selected uniformly in {1, ..., h} and ⟨b_1, ..., b_n⟩ is set to v_p.
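A Python sketch of this "tug-of-war" F_2 estimator (my code; for simplicity the signs b_t are drawn fully independently, which is stronger than the 4-wise independence the analysis needs, and stream values are assumed to lie in {0, ..., n−1}):

    import random, statistics

    def ams_f2_estimate(stream, n, s1=16, s2=5, seed=None):
        # Each basic estimator keeps a single counter Z = sum_t b_t * m_t
        # with random signs b_t in {-1,+1} and outputs X = Z^2; E[X] = F_2.
        # A true small-space implementation would generate the signs from a
        # 4-wise independent hash family instead of storing them explicitly.
        rng = random.Random(seed)
        num = s1 * s2
        signs = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(num)]
        Z = [0] * num
        for a in stream:                 # one pass; O(1) counter update per estimator
            for j in range(num):
                Z[j] += signs[j][a]
        X = [z * z for z in Z]
        means = [sum(X[i * s1:(i + 1) * s1]) / s1 for i in range(s2)]
        return statistics.median(means)

    A = [random.randrange(8) for _ in range(5000)]
    print(ams_f2_estimate(A, 8, seed=1))    # should be close to the exact F_2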
