Streaming Algorithms for Extent Problems in High Dimensions

Size: px
Start display at page:

Download "Streaming Algorithms for Extent Problems in High Dimensions"

Transcription

1 Streaming Algorithms for Extent Problems in High Dimensions Pankaj K. Agarwal R. Sharathkumar Department of Computer Science Duke University {pankaj, Abstract We develop single-pass) streaming algorithms for maintaining extent measures of a stream S of n points in R d. We focus on designing streaming algorithms whose working space is polynomial in d polyd)) and sublinear in n. For the problems of computing diameter, width and minimum enclosing ball of S, we obtain lower bounds on the worst-case approximation ratio of any streaming algorithm that uses polyd) space. On the positive side, we introduce the notion of blurred ball cover and use it for answering approximate farthestpoint queries and maintaining approximate minimum enclosing ball and diameter of S. We describe a streaming algorithm for maintaining a blurred ball cover whose working space is linear in d and independent of n. 1 Introduction In many applications, data arrives rapidly as a stream of points and there is limited space to store the input. Algorithms in the streaming model have to work with one or few passes over the data, using small space. Motivated by a wide range of applications, including networking, databases and geographic information systems, there is extensive work on streaming algorithms. Often, it is impossible to compute exact solutions given the space constraints. Hence most of the work has focused on developing approximation algorithms and several novel techniques have been developed. See books [5, 7] for a comprehensive review. In this paper, we design streaming algorithms for several geometric problems in high dimensions. Problem statement. For a point x R d and a value r > 0, let Bx,r) denote a ball of radius r centered at x. For a ball B, let cb), rb) denote its center and radius Work on this paper is supported by NSF under grants CNS , CCF , CCF and DEB , by ARO grants W911NF and W911NF , by an NIH grant 1P50-GM , by a DOE grant OEG- P00A070505, and by a grant from the U.S. Israel Binational Science Foundation. respectively, and let 1 + ε)b denote the ε-expansion of B, i.e., BcB),1 + ε)rb)). Let S be a finite) set of points in R d. We use MEBS) to denote the smallest ball that contains S. For a parameter α > 1, a ball B is called an α- approximation of the minimum enclosing ball of S, or α-mebs) for brevity, if S B and rb) α rmebs)). A set C S is called an α-coresets) if S α MEBC). Let diams) denote any farthest pair diameteral pair) of points in S, and any pair of points r,s S such that, for every pair p,q S, pq α rs, is referred to as α-diams) α-diameteral pair). For any slab J, which is a region bounded by a pair h 1,h ) of parallel d 1)-dimensional hyperplanes, let dj) be the minimum distance between h 1 and h. Let widths) be any slab J such that S J and dj) is minimized. For α > 1, α-widths) is a slab J such that S J and dwidths)) dj ) α dwidths)). Finally, for a point x R d, an α-farthest-neighbor of x, α-fnx) is a point q S such that, for every p S, xp α xq. In this paper, we study the problems of maintaining α-mebs), α-coresets), α-widths), and α-diams), and of answering α-fnx) queries, in the streaming model. We assume that the input points arrive one by one, and the algorithm updates the information quickly using little space. Our goal is to design streaming algorithms whose working space and update time are polynomial in d and sub-linear in the size of S. Related work. There is extensive literature on fast algorithms for computing various extent measures, such as diameter, width, and smallest enclosing ball of a point set in low dimensions [, 14, 15, 17, 6, 8]. Agarwal et al. [1] have developed approximation algorithms for computing many extent measures of a set of n points in R d using coresets whose running time is On + 1/ε Od) ); see also [, 4, 11, 1]. Although these algorithms work well in small dimensions, the exponential dependency on d makes them unsuitable for higher dimensions. This has led to work on develop- 1

2 ing polynomial-time approximation algorithms in n, d and 1/ε) for different extent measures and other related problems. For example, Bădoiu and Clarkson [9] develop an elegant coreset-based algorithm that given a point set S R d computes a 1 + ε)-mebs) in time Ond/ε+1/ε 5 ). See [10, 1,, 4] for other similar results. Clarkson [16] presented a general framework that gives coreset-based algorithms for a number of geometric problems in high dimensions; see also [19]. Goel et al. [0] describe fast approximation algorithms for several proximity problems. For instance, they describe a data structure that answers -FNx) queries in Od) time after spending Ond) time in preprocessing. There is also some work on streaming algorithms for extent problems in high dimensions. Chan and Zarrabi- Zadeh [9] gave a streaming algorithm, which maintains a /)-MEBS), for a stream S R d, using Od) space. They also proved that any deterministic algorithm that stores only a single ball in its working space cannot maintain a better than α-mebs), for α 1+. Indyk [] proposed a streaming algorithm for maintaining α-diams) for α >, using Odn 1/α 1) log n) space. Recently Clarkson and Woodruff [18] described streaming algorithms for several problems in numerical algebra. Our results. We prove upper and lower bounds on the size of data structures for maintaining various extent measures under the streaming model. First, we present in Section, a data structure in the streaming model called the ε-blurred ball cover, for ε > 0, that maintains a subset K S of O1/ε )log1/ε)) points. Roughly speaking K is the union of a set of {K 1,...,K u }, u = O1/ε )log1/ε)), of subsets of S each of size O1/ε) so that S lies in the union of 1 + ε)mebk i ). The data structure can be updated in Od/ε )log1/ε)) amortized time. We then show in Section that K can be used to: i) compute a α-fnx), for any query x R d and for α = + ε; ii) maintain a α-diams) for α = + ε; iii) maintain a α-coresets), for α = + ε; iv) maintain a α-mebs) for α = 1+ + ε. Next, we show in Section 4 that any randomized streaming algorithm that maintains α-diams), α- widths), α-mebs), or α-coresets) with probability at least /, requires Ωmin S,expd 1/ ))) space for certain values of α. In particular, α < 1 /d 1/ ) for α-diams) and α-coresets), α d 1/ /8 for α- widths), and α < 1+ 1 /d 1/ ) for α-mebs). All lower bounds are obtained using known lower bounds on the one-round communication complexity of indexing [5]. Communication complexity has been widely used to prove lower bounds on the size of various data structures including in the streaming model [6, 7, 18]. Our algorithms for maintaining α-diams) and α- coresets) are close to optimal. Note the contrast between the lower and upper bounds for MEB a coreset based algorithm has a lower bound of, but we are able to circumvent this bound by, roughly speaking, maintaining a set of O1/ε )log1/ε)) balls and returning a ball that contains these balls. We can regard the centers of these balls as weighted points and thus we can get better results by maintaining a weighted weak coreset it is not a subset of input points. Although it is known that weak ε-nets are more powerful than ε-nets [1], we are unaware of similar results for coresets of MEB and other extent measures. Blurred Ball Cover This section defines the notion of blurred ball cover and describes an algorithm for maintaining such a cover in the streaming model. For a parameter 0 < ε 1, an ε- blurred ball cover of a set S of n points in R d, denoted by K = KS), is a sequence K 1,K,...,K u, where each K i S is a subset of O1/ε) points that satisfies the following three properties; let B i = MEBK i ) and K = i u K i. P1) For all 1 i < u, rb i+1 ) 1 + ε /8)rB i ); P) For all i < j, K i 1 + ε)b j ; P) For every p S, i u such that p 1 + ε)b i. Algorithm 1 describes a simple procedure called UpdateK,A), that given K := KS) and a set of points A R d, computes KS A). If we update K as each new point arrives, then A consists of a single point. However, as we will see below, it will be more efficient for some of our applications to update A in a batched mode. That is, newly arrived points are stored in a buffer A and when its size exceeds certain threshold, Update procedure is called to update K. Update relies on a procedure Approx-MEBZ,ε) that takes a set Z of points and a parameter 0 < ε 1 and returns a set G Z of O1/ε) points and B = MEBG) such that Z 1 + ε)b. Bădoiu and Clarkson [9] have described such a procedure with Od Z /ε + 1/ε 5 ) running time. For efficiency reasons, we explicitly maintain B i = MEBK i ) for every K i K. Update procedure. If there is a point in A that does not lie in the union of the ε-expansions of B i s, then the Update procedure invokes Approx-MEB on K A

3 B i c i c B i B q h c i c B q a) b) Figure 1: Proof of Lemma.. a) c i c εr i /, b) c i c > εr i /. with approximation ratio ε/. Let K K be the point set and B = MEBK ) be the ball returned by Approx-MEB. The Update procedure adds K to K and then deletes all K i s for which rk i ) εrk )/4. Algorithm 1 UpdateK, A) 1: if p A, i u, p 1 + ε)b i then : K,B := Approx-MEBK A),ε/) : K D := {K i rb i ) εrb )/4} 4: K := K \ K D ) K 5: end if Correctness. The proof of correctness relies on the following well-known observation; see e.g. [10]. Lemma.1. Let P be a set of points in R d and let B = MEBP). Then any closed halfspace that contains cb) also contains at least one point of P that is on B. Lemma.. For any i < u, rb i+1 ) 1+ε /8)rB i ). Proof. Update procedure adds K i+1 to K only if there is a point q A such that q 1 + ε)b i. Let B = MEB K {q}); rb i+1 ) rb ). Let r = rb ), c = cb ), c i = cb i ), and r i = rb i ). We prove that r > 1 + ε /8)r i. If c i c εr i / Figure 1a)), then r c q c i q c c i 1 + ε)r i εr i / 1 + ε /8)r i. On the other hand, if c c i > εr i / Figure 1b)), then let h be the hyperplane passing through c i with the direction c i c as its normal, and let h + be the halfspace, bounded by h, that does not contain c. By Lemma.1, there is a point q K i h + that is at a distance r i from c i. Therefore r q c c i c + q c i ) 1/ εr i /) + r i ) 1/ 1 + ε /8)r i. We now prove that P1) P) are maintained by Update. If for every p A, there is an i u such that p 1+ε)B i, then the set K does not change, and P1) P) continue to hold. Hence, assume that there is a p A such that p B i for all i u. By Lemma., property P1) continues to hold after each update. Note that if p 1 + ε)b i for all 1 i u, then Update computes a 1+ε/)-MEBK A). Since K is the only subset added to K, K 1 + ε)b and K is added at the end of the sequence, K i 1+ε)B. Thus property P) holds after each Update. Note that a prefix K D is deleted from K, so P) may be violated. However, the next two lemmas prove that P) is also satisfied. Lemma.. For i < j, cb i ) 1 + ε)b j. Proof. Suppose, on the contrary, that cb i ) 1+ε)B j. Let γ be a halfspace that contains cb i ) but γ 1 + ε)b j =. By Lemma.1, γ contains a point p K i, but p 1+ε)B j, which contradicts the fact K i 1+ε)B j by P)). Hence, cb i ) 1 + ε)b j. Lemma.4. For all K i K D, 1 + ε)b i 1 + ε)b. Proof. Let c = cb ), r = rb ), c i = cb i ), and r i = rb i ). Let K i K D, then r i εr /4. Since K i 1 + ε/)b, the proof of Lemma. implies cb i ) 1 + ε/)b. For any point x 1 + ε)b i, xc c c i + c i x c c i ε)r i 1 + ε/)r + ε1 + ε)r /4 1 + ε)r.

4 Therefore q 1+ε)B and thus 1+ε)B i 1+ε)B. Lemma.4 immediately implies that for any K i K D, if there is a point q 1+ε)B i then q 1+ε)B. Hence, for every q S, there is an i such that q 1 + ε)b i, implying P). Size and update time. Let r i = rb i ). Update ensures that r u /r 1 4/ε. By P1), r i ε /8)r i. Therefore u log 1+ε /84/ε) = O1/ε )log1/ε)). Hence K K i = O1/ε )log1/ε)). Note that Update spends Ou. A ) = Od A /ε )log1/ε)) time to test the condition in line 1. The time spent in computing K and B in line is O A + K )d/ε + 1/ε 5 )) or Od A /ε) + d/ε 4 )log1/ε) + 1/ε 5 )). If we update K after the arrival of each new point, then the update time is Od/ε 5 ). However, if we batch the updates and invoke Update only after O1/ε ) points have arrived, the total time spent will be Od/ε 5 )log1/ε)) which is the time taken by line 1 of the update procedure. The amortized time for Update will then be Od/ε )log1/ε)). We thus conclude the following. Theorem.1. For any 0 < ε 1, an ε-blurred ball cover of size Od/ε )log1/ε)) of a stream of points can be maintained in Od/ε )log1/ε)) amortized time. Applications of Blurred Ball Cover We now describe streaming algorithms for answering + ε)-fnx) queries and for maintaining + ε)- diams), + ε)-mebs) and 1+ + ε)-mebs) using the blurred ball cover. We use the following simple inequality repeatedly. For any x, y R,.1) x + y) x + y ) 1/. Farthest-neighbor queries. We maintain an ε- blurred ball cover K and update it in the batched mode. Let A be the set of newly arrived points in S that have not been processed yet. Set Q = K A; Q = O1/ε )log1/ε)). For any query point x R d, we return the point of Q that is farthest from x. We claim that for any x R d, max xp + ε)max qx. p S q Q by P), there is an i u such that p 1 + ε)b i. Assuming c i = cb i ) and r i = rb i ),.) xp xc i + c i p xc i ε)r i. Let h be the hyperplane passing through c i and normal to xc i and let h + be the halfspace, bounded by h, that does not contain x. By Lemma.1, there is a point z K i such that zc i = r i. Since xc i z is obtuse,.) xz xc i + c i z ) 1/ xc i + r i ) 1/. By combining.) and.) and using.1), xp xc i + r i ) 1/ + εr i + ε) xz + ε) xq. We conclude the following. Theorem.1. For a stream S of points in R d and a parameter 0 < ε 1, there is a data structure of size Od/ε )log1/ε)) that answers + ε)-fnx) in time Od/ε )log1/ε)). The amortized update time is Od/ε )log1/ε)). Diameter. For a point set S, we maintain a α-diams), for α = + ε, using the above data structure for answering α-fnx) queries. Let p,q be the current α- diameteral pair maintained by the algorithm. When a new point z arrives, we compute the w = α-fnz). If wz > pq, we replace p,q) with w,z). We conclude the following. Theorem.. For a stream S of n points in R d, there is a data structure of size Od/ε )log1/ε)) that maintains a + ε)-diams). The amortized update time of this structure is Od/ε )log1/ε)). Coreset for minimum enclosing ball. For a stream S of points, we maintain an ε-blurred ball cover K and update it in the batched mode. Let A be the set of points not processed by Update and let Q = K A. Let B = MEBQ), c = cb) and r = rb). We argue that for every K i K, 1 + ε)b i + ε)b. Lemma.1. For all i u, 1 + ε)b i + ε)b. Let p = arg max p S px and q = arg max q Q qx. If p A then p = q and the claim is obviously true. If p S \A, i.e., p has been processed by Update, then, 4 Proof. For any ball B i in the blurred ball cover, let c i = cb i ) and r i = rb i ). Observe that K i B. Let t = arg max t 1+ε)Bi tc. t c = cc i ε)r i.

5 By Lemma.1, there is a point p K i such that pc cc i + ri. But pc r. Hence, t c pc cc i ε)r i cci + ri + ε, or t c +ε) pc +ε)r. Thus 1+ε)B i + ε)b. Lemma.1 in conjunction with property P) of blurred ball cover implies that S + ε)b. Thus Q is a + ε)-coresets). We conclude the following. Theorem.. For a stream S of points in R d and a parameter 0 < ε 1, there is a data structure of size Od/ε )log1/ε)) that maintains a +ε)-coresets) of minimum enclosing ball. The amortized update time of this structure is Od/ε )log1/ε)). Minimum enclosing ball. For a stream S of points, we maintain a ε/9)-blurred ball cover K of S. K is updated whenever a new point arrives. Let B = {B 1,...,B u }. Let B = MEBB). We return 1 + ε/)b which can be computed in time O1/ε 5 ) []. Hence the total update time is O1/ε 5 ). Let ˆr = rmebs)). Lemma.. rb 1 + ) ) + ε/ ˆr. Proof. Let K = K i B i B and B = {B i K i K }. Note that any subsequence of K, and hence K, satisfies properties P1) and P) of blurred ball cover. Let B t be the largest ball in B. Let c t = cb t ), r t = rb t ),c = cb ),r = rb ) and let k = r /r t. Observe that c t c = r r t = k 1)r t. Let h be a hyperplane passing through c with a normal parallel to c t c. Let h + be the halfspace bounded by h and that does not contain c t. By Lemma.1, there is a point b i B B i, for some i, such that b i c = r = kr t. Since c t c b i is obtuse, we have.4) c t b i k 1) rt + k rt. Since the proof of Lemma.1 requires only P) of blurred ball cover, it holds for any subsequence of K. In particular, it holds for K. Hence, for all B i B, B i + ε/9)b t, implying that.5) c t b i + ε/9)r t. Combining.4) and.5), we have k 1) r t + k r t + ε/9) r t 1 + ε/9) r t. 5 Solving this quadratic equation, we can deduce that, k 1+ + ε/. Thus r 1 + ) kr t kˆr + ε/ ˆr. By P), S i u1 + ε/9)b i 1 + ε/9)b. By Lemma., we have r 1 + ) + ε/ ˆr. Thus 1 + ε/)r implying that 1 + ε/)b 1 + ) + ε/ 1 + ) + ε ˆr, is indeed a MEBS). We conclude the following. 1 + ε/)ˆr 1+ ) + ε - Theorem.4. Given a stream S of points in R d, there is a data structure of size Od/ε )log1/ε)) that maintains a +1 + ε)-mebs). The worst-case) update time of this structure is Od/ε 5 ). Remark. A slightly better bound on r, the radius of the ball returned by the above algorithm can be obtained using a more involved argument, While, from the lower bounds on α-mebs) in Section 4 it follows that r > 1+, the following example shows that one cannot hope to prove r = 1+ + ε. Let the input be a stream S = p 1,p,p,p 4,..., where p 1 = 1 1,, ), p = 1 1,, ), p = 1, 1,0), p 4 = 1,0,0), and p 5,p 6,..., are points on a unit sphere in R. When the points in S arrive in this order, after the arrival of p 4, the blurred ball cover will consist of K = {{p 1,p }, {p 1,p,p }, {p 1,p,p,p 4 }}. The three balls {B 1,B,B } of K are described by ) ) 1 1 B 1 = B,,0,, ) ) 1 B = B,0,0 1,, B = B0,0,0),1).

6 Let B = MEBB 1 B ) with rb ) = 1 + )/ and cb ) = 1/ ) 1/),0,0). Let q be the farthest point of B from cb ). qcb ) = / > 1+. We evaluate B to be θ-mebs) where θ Ct,/d 1/ ) C t,/d 1/ ). Let t resp. t ) be a point on the boundary of Cs,/d 1/ ) resp. C s,/d 1/ )). Then st st st. A simple trignometric calculation shows that see Figure a)) 4 Lower Bounds In this section we show that any randomized streaming algorithm that maintains α-diams), α-widths), α- MEBS), or α-coresets) with probability at least /, requires Ωmin S,expd 1/ ))) space for certain values of α. In particular, α < 1 /d 1/ ) for α-diams) and α-coresets), α d 1/ /8 for α-widths), and α < 1+ 1 /d 1/ ) for α-mebs). Let S d 1 denote d 1)-dimensional unit sphere centered at origin. We show how to sample points from S d 1 which will be crucial for proving our lower bounds. Sampling in S d 1. Let u S d 1 and ε 0,1). Let H be the hyperplane x,u = ε, i.e., H is the hyperplane that is at distance ε from the origin and normal to the vector u. H divides S d 1 into two spherical regions. We refer to the smaller spherical region as a spherical cap and denote it by Cu,ε). We define its measure to be µu,ε) = SACu,ε)) SAS d 1 ), where SAX) is the surface area of X. known [8] that 4.6) µu,ε) exp dε /). It is well Lemma 4.1. There is a centrally symmetric point set K S d 1 of size Ωexpd 1/ )) such that for any pair of distinct points p,q K if p q, then 1 /d 1/ ) pq 1 + /d 1/ ). Proof. The set K is constructed incrementally. We maintain the invariant that if p,q K and q p then q Cp,/d 1/ ). We also maintain a centrally symmetric region F = S d 1 \ q K Cq,/d1/ ). Initially K = and F = S d 1. If F, we choose an antipodal pair of points p, p from F. By construction, p, p Cq,/d 1/ ) for any point q K. Moreover, by symmetry, no point in K lies in Cp,/d 1/ ) C p,/d 1/ ). We add p and p to K. We set F = F \ { Cp,/d 1/ ) C p,/d 1/ ) } to ensure that the invariant holds after adding p and p. The algorithm terminates when F =. By 4.6), µp,/d 1/ )+µ p,/d 1/ ) exp d 1/ ), therefore the above algorithm performs Ωexpd 1/ )) steps before it stops. Hence K = Ωexpd 1/ )). Let s,t K be such that s t, t. By construction, s 6 st st 1 + /d 1/ ) + 1 4/d / ) 1 + /d 1/ ) 1/ 1 + /d 1/ ). Similarly one can show st st 1 /d 1/ ). ) 1/ Diameter. The problem Index is defined as follows. Let Alice and Bob be two players. Suppose Alice has a binary string σ {0,1} k and Bob has an index i [1 : k]. For any j [1 : k], let σ j be the bit that appears in the jth position of σ. Alice sends a message to Bob after which Bob needs to determine σ i with probability at least /. It is well-known that the length of any message sent by Alice that helps Bob determine σ i with probability at least / is Ωk) [5]. We choose the smallest d so that k expd 1/ ). Hence k = Θexpd 1/ )). Let A be a streaming algorithm that maintains α-diams) for α 1 /d 1/ ) with probability at least /. We choose expd 1/ ) points on S d 1 using Lemma 4.1. Let L be the subset of at least k of these points that lie on the hemisphere x 1 0. We sort L in lexicographic order, which induces a map ϕ : [1 : k] L. The map ϕ is independent of the input string σ and the index i. Hence we can assume that both Alice and Bob know ϕ without communicating any bits. For any σ {0,1} k, let ϕσ) = {ϕx) σ x = 1}. Alice adds ϕσ) as input to A in an arbitrary order. Then, Alice communicates the working space of A to Bob. In order to determine the σ i, Bob adds as input, ϕi) to A. Let p,q S be the pair of points returned by A at the end of the protocol. If p = q, Bob determines σ i = 1. Else, Bob determines σ i = 0. The following lemma shows the correctness of this protocol. Lemma 4.. For any σ {0,1} k and i [1 : k]. Let S = ϕσ) { ϕσ i )} and α 1 /d 1/ ). Every α-diams) pair is an antipodal pair if and only if σ i = 1. Proof. Since all points in ϕσ) lie in one hemisphere, there is an antipodal pair of points in S if and only if ϕi) ϕσ), i.e. σ i = 1. If σ i = 0, then ϕi) ϕσ)

7 s t t t p 1 ε ε ϕi) ϕi) s a) q i 1 + ǫ) b) Figure : a) Proof of Lemma 1, b) Lower bound for α-mebs); here ε = /d 1/. and A does not return an antipodal pair of points. On the other hand, suppose σ i = 1 and suppose there is an α-diameteral pair p,q S such that p q. Since {ϕσ i ), ϕσ i )} S, diams). From Lemma 4.1, it follows that pq 1 + /d 1/ ), i.e., diams) > α pq for α 1 /d 1/ ). Thus p,q is not an α- diameteral pair implying that α-diams) is an antipodal pair of points. Since A returns an α-diams) with probablity at least /, Bob determines σ i correctly with the same probability. Since the communication complexity of any randomized algorithm for the indexing problem is Ωk) = Ωexpd 1/ )), it follows that the work space of A is Ωexpd 1/ )). We thus conclude the following. Theorem 4.1. Any streaming algorithm that maintains an α-diams) of a set S of n points in R d, for α < 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Minimum enclosing ball. Let A be an algorithm that maintains an α-mebs), for α 1+ 1 /d 1/ ) with probability at least /. The map ϕ is defined as above. Alice passes points in ϕσ) as input to A in an arbitrary order and communicates the working space of A to Bob. For index i, let q i be the point that is in the direction ϕi) and at distance 1 + /d 1/ ) resp /d 1/ )) from ϕi)resp. ϕi)); see Figure b). Bob adds q i as input to A. If A returns a ball of radius at least ), then Bob declares σ i = 1. Otherwise, Bob declares σ i = 0. Let S = {ϕσ) {q i }}. Note that ϕi)q i = /d 1/ ). Hence if σ i = 1, then ϕi),q i S and hence the radius of any ball containing S must be at least /d 1/ ) from ϕi). Since, q i ϕi)) = 1 + /d 1/ ), the radius of MEBS) is at most 1+/d 1/ ). In this case, the radius of any α-mebs) is at most α rmebs) ), for α < 1+ 1 /d 1/ ). Since A outputs a correct α-mebs) with probability at least /, Bob can determine σ i with the same probability. We can thus conclude that the workspace of A is Ωexpd 1/ )) and obtain the following. Theorem 4.. Any streaming algorithm that maintains an α-mebs) of a set S of n points in R d, for α < 1+ 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Coreset for minimum enclosing ball. The reduction to α-mebs) can be modified to obtain lower bounds on α-coreset. Let A be an algorithm that maintains a α-coresets) for α 1 /d 1/ ) with probability /. Let ϕ be a map as defined previously. Alice passes points in ϕσ) as input to A in an arbitrary order and communicates the working space of A to Bob. For index i, q i is the point as defined above; see Figure b). Let C be α-coreset maintained by A. For index i, Bob adds q i as input to A and checks whether ϕi) C. If ϕi) C, Bob reports σ i = 1. Otherwise, Bob reports σ i = 0. The correctness of this reduction follows from the following lemma. Lemma 4.. If C is an α-coresetϕσ) {q i }), for α < 1 /d 1/ ) then, i) q i C, and ii) ϕi) C if and only if σ i = 1. ϕi)q i / = /d 1/ ))/ > On the other hand, if σ i = 0, then ϕi) S. By Lemma 4.1, all points in ϕσ) are within distance 1+ 7 Proof. Let S = ϕσ) {q i }. First, we prove i). Suppose, on the contrary, that q i C and hence C ϕσ). Let B d be the unit ball centered at origin. Let β be MEBC). Since C S d 1, rβ) 1 and

8 cβ) B d. Observe that, for any point t B d, tq i 1 + /d 1/ ). Hence, cβ)q i 1 + /d 1/ ) 1 + /d 1/ )rβ) > α rβ), leading to a contradiction that C is an α-coresets). Now, we prove ii). If σ i = 0, then ϕi) S and C S implies ϕi) C. On the other hand, suppose σ i = 1 but ϕi) C. Let β = MEBC). Observe that all points in C are within a distance 1+/d 1/ ) from ϕi). Hence rβ) 1 + /d 1/ ). It follows from i) that the center cβ) β, where β = Bq i, 1 + /d 1/ )). For any point p β, pϕi), therefore cβ)ϕi) 1 /d 1/ ) 1 + /d 1/ ) α rβ), for α < 1 /d 1/ ), leading to a contradiction that C is an α-coresets). Hence, we conclude the following. Theorem 4.. Any streaming algorithm that maintains an α-coresets) of a set S of n points in R d, for α < 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Width. Let B be a ball centered at origin with rb) = 1/. For any vector u S d 1, let h u be a hyperplane passing through the origin and normal to u, i.e., x,u = 0. Let N u h u be a set of d points such that convn u { u,u}) contains B; see Figure. For any σ {0,1} k, let ϕσ) = { p p ϕσ)}. v v 1 u u Figure : N u = {v 1,v,v }, B convn u { u,u}). B v h u to A in an arbitrary order and then communicates the working space of A to Bob. For index i, Bob adds N ϕi) to A. If A reports a slab J such that dj) is at least 1, then Bob reports σ i = 1 and otherwise Bob reports σ i = 0. The correctness of the reduction follows from the following observation. Let S = {ϕσ) ϕσ) N ϕi) }. If σ i = 1, then {N ϕi) { ϕi),ϕi)}} S. By construction of N ϕi), B convn ϕi) {ϕi), ϕi)}). Since rb) = 1/, dj) dwidths)) 1. On the other hand, if σ i = 0, then ϕi),ϕi)) S. Let W be a slab bounded by the two hyperplanes x,ϕi) = /d 1/ and x,ϕi) = /d 1/. By Lemma 4.1 and the fact that N ϕi) h ϕi) W, we have S W and dwidths)) 4/d 1/. For α < d 1/ /8, dj) α dwidths)) α 4/d 1/ 1/. Hence, we conclude the following. Theorem 4.4. Any streaming algorithm that maintains an α-widths) of a set of n points in R d, for α < d 1/ /8, with probability at least /, requires Ωmin { n,expd 1/ ) } ) bits of storage. 5 Conclusion In this paper, we design streaming algorithms for maintaining several extent measures on high-dimensional point sets. Our approach is based on the notion of blurred ball cover that, for a stream S of points, maintains a subset K S. K can be interpreted as the union of a set of {K 1,...,K u }, u = O1/ε )log1/ε)), of subsets of S each of size O1/ε) so that S lies in the union of 1+ε)MEBK i ). We provide a simple streaming algorithm for maintaing blurred ball cover. We then use blurred ball cover to provide streaming algorithms for various extent measures. We conclude by stating related problems. Is there a fully dynamic data structure, whose size is linear in n and polyd,1/ε), that maintains a 1 + ε)-mebs)? Can the concept of blurred ball cover be generalized for maintaining approximate smallest encosing convex shape of a high dimensional point set, e.g., minimum enclosing ellipsoids, under the streaming model? The Index problem can be reduced to α-widths) as follows. Let A be a streaming algorithm that maintains α-widths), for α < d 1/ /8, with probability at least /. Alice passes points in {ϕσ) ϕσ)} 8 References [1] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan, Approximating extent measures of points, J. ACM, ),

9 [] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan, Geometric approximation via coresets, in: Combinatorial and Computational Geometry J. Goodman, J. Pach, and E.Welzl, eds.), Cambridge University Press, 005, pp [] P. K. Agarwal, J. Matoušek, and S. Suri, Farthest neighbors, maximum spanning trees and related problems in higher dimensions, Comput. Geom. Theory Appl., 1 199), [4] P. K. Agarwal and H. Yu, A space-optimal data-stream algorithm for coresets in the plane, Proc. rd Annual Sympos. Comput. Geom., 007, pp [5] C. Aggarwal, Data Streams: Models and Algorithms, Springer, 007. [6] N. Alon, Y. Matias, and M. Szegedy, The space complexity of approximating the frequency moments, J. Comput. Syst. Sci., ), [7] A. Andoni, D. Croitoru, and M. Patrascu, Hardness of nearest neighbor under L-infinity, Proc. 49th Annual IEEE Sympos. Foundations of Comp. Sc., 008, pp [8] K. Ball, An elementary introduction to modern convex geometry, in Flavors of Geometry, ), [9] M. Bădoiu and K. L. Clarkson, Optimal core-sets for balls, Comput. Geom. Theory Appl., ), 14. [10] M. Bădoiu, S. Har-Peled, and P. Indyk, Approximate clustering via core-sets, Proc. 4th Annual ACM Sympos. Theory Comput., 00, pp [11] T. M. Chan, Approximating the diameter, width, smallest enclosing cylinder, and minimum-width annulus, Intl. J. Comput. Geom. Appl., 1 00), [1] T. M. Chan, Faster core-set constructions and datastream algorithms in fixed dimensions, Comput. Geom. Theory Appl., 5 006), 0 5. [1] B. Chazelle, H. Edelsbrunner, M. Grigni, L. Guibas, M.Sharir, and E. Welzl, Improved bounds on weak ε-nets for convex sets, Discrete Comput. Geom., ), [14] B. Chazelle and J. Matoušek, On linear-time deterministic algorithms for optimization problems in fixed dimension, J. Algorithms, ), [15] K. L. Clarkson, Las Vegas algorithms for linear and integer programming when the dimension is small, J. ACM, ), [16] K. L. Clarkson, Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm, Proc. 19th Annual ACM-SIAM Sympos. Discrete Algorithms, 008, pp [17] K. L. Clarkson and P. W. Shor, Applications of random sampling in computational geometry, II., Discrete Comput. Geom., 1987), 195. [18] K. L. Clarkson and D. P. Woodruff, Numerical linear algebra in the streaming model, Proc. 41st Annual ACM Sympos. Theory of Comput., 009, pp [19] B. Gärtner and M. Jaggi, Coresets for polytope distance, Proc. 5th Annual Sympos. Comput. Geom., 009, pp. 4. [0] A. Goel, P. Indyk, and K. Varadarajan, Reductions among high dimensional proximity problems, Proc. 1th Annual ACM-SIAM Sympos. Discrete Algorithms, 001, pp [1] S. Har-Peled and K. R. Varadarajan, Projective clustering in high dimensions using core-sets, In Proc. 18th Annual Sympos. Comput. Geom, 00, pp [] P. Indyk, Better algorithms for high-dimensional proximity problems via asymmetric embeddings, Proc. 14th Annual ACM-SIAM Sympos. Discrete Algorithms, 00, pp [] P. Kumar, J. S. B. Mitchell, and E. A. Yildirim, Approximate minimum enclosing balls in high dimensions using core-sets, J. Exp. Algorithmics, 8 00), 1.1. [4] P. Kumar and E. A. Yildirim, Minimum-Volume Enclosing Ellipsoids and Core Sets, J. Optimization Theory and Appl., ), 1 1. [5] E. Kushilevitz and N. Nisan, Communication Complexity, Cambridge University Press, [6] J. Matoušek, M. Sharir, and E. Welzl, A subexponential bound for linear programming, Algorithmica, ), [7] S. Muthukrishnan, Data Streams: Algorithms and Applications, Now Publishers Inc., 005. [8] E. A. Ramos, An Optimal Deterministic Algorithm for Computing the Diameter of a three-dimensional point set, Discrete Comput. Geom., 6 001), 44. [9] H. Zarrabi-Zadeh and T. Chan, A simple streaming algorithm for minimum enclosing balls, Proc. 18th Annual Canadian Conf. Comput. Geom., 006, pp

Geometric Optimization Problems over Sliding Windows

Geometric Optimization Problems over Sliding Windows Geometric Optimization Problems over Sliding Windows Timothy M. Chan and Bashir S. Sadjad School of Computer Science University of Waterloo Waterloo, Ontario, N2L 3G1, Canada {tmchan,bssadjad}@uwaterloo.ca

More information

A Streaming Algorithm for 2-Center with Outliers in High Dimensions

A Streaming Algorithm for 2-Center with Outliers in High Dimensions CCCG 2015, Kingston, Ontario, August 10 12, 2015 A Streaming Algorithm for 2-Center with Outliers in High Dimensions Behnam Hatami Hamid Zarrabi-Zadeh Abstract We study the 2-center problem with outliers

More information

An efficient approximation for point-set diameter in higher dimensions

An efficient approximation for point-set diameter in higher dimensions CCCG 2018, Winnipeg, Canada, August 8 10, 2018 An efficient approximation for point-set diameter in higher dimensions Mahdi Imanparast Seyed Naser Hashemi Ali Mohades Abstract In this paper, we study the

More information

arxiv:cs/ v1 [cs.cg] 7 Feb 2006

arxiv:cs/ v1 [cs.cg] 7 Feb 2006 Approximate Weighted Farthest Neighbors and Minimum Dilation Stars John Augustine, David Eppstein, and Kevin A. Wortman Computer Science Department University of California, Irvine Irvine, CA 92697, USA

More information

Geometric Approximation via Coresets

Geometric Approximation via Coresets Discrete and Computational Geometry MSRI Publications Volume 52, 2005 Geometric Approximation via Coresets PANKAJ K. AGARWAL, SARIEL HAR-PELED, AND KASTURI R. VARADARAJAN Abstract. The paradigm of coresets

More information

A Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees

A Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees A Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees Sunil Arya Hong Kong University of Science and Technology and David Mount University of Maryland Arya and Mount HALG

More information

A Space-Optimal Data-Stream Algorithm for Coresets in the Plane

A Space-Optimal Data-Stream Algorithm for Coresets in the Plane A Space-Optimal Data-Stream Algorithm for Coresets in the Plane Pankaj K. Agarwal Hai Yu Abstract Given a point set P R 2, a subset Q P is an ε-kernel of P if for every slab W containing Q, the (1 + ε)-expansion

More information

Succinct Data Structures for Approximating Convex Functions with Applications

Succinct Data Structures for Approximating Convex Functions with Applications Succinct Data Structures for Approximating Convex Functions with Applications Prosenjit Bose, 1 Luc Devroye and Pat Morin 1 1 School of Computer Science, Carleton University, Ottawa, Canada, K1S 5B6, {jit,morin}@cs.carleton.ca

More information

Geometric Approximation via Coresets

Geometric Approximation via Coresets Geometric Approximation via Coresets Pankaj K. Agarwal Sariel Har-Peled Kasturi R. Varadarajan October 22, 2004 Abstract The paradigm of coresets has recently emerged as a powerful tool for efficiently

More information

The τ-skyline for Uncertain Data

The τ-skyline for Uncertain Data CCCG 2014, Halifax, Nova Scotia, August 11 13, 2014 The τ-skyline for Uncertain Data Haitao Wang Wuzhou Zhang Abstract In this paper, we introduce the notion of τ-skyline as an alternative representation

More information

On Approximating the Depth and Related Problems

On Approximating the Depth and Related Problems On Approximating the Depth and Related Problems Boris Aronov Polytechnic University, Brooklyn, NY Sariel Har-Peled UIUC, Urbana, IL 1: Motivation: Operation Inflicting Freedom Input: R - set red points

More information

Approximating the Minimum Closest Pair Distance and Nearest Neighbor Distances of Linearly Moving Points

Approximating the Minimum Closest Pair Distance and Nearest Neighbor Distances of Linearly Moving Points Approximating the Minimum Closest Pair Distance and Nearest Neighbor Distances of Linearly Moving Points Timothy M. Chan Zahed Rahmati Abstract Given a set of n moving points in R d, where each point moves

More information

Projective Clustering in High Dimensions using Core-Sets

Projective Clustering in High Dimensions using Core-Sets Projective Clustering in High Dimensions using Core-Sets Sariel Har-Peled Kasturi R. Varadarajan January 31, 2003 Abstract In this paper, we show that there exists a small core-set for the problem of computing

More information

Partitions and Covers

Partitions and Covers University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Dong Wang Date: January 2, 2012 LECTURE 4 Partitions and Covers In previous lectures, we saw

More information

Fly Cheaply: On the Minimum Fuel Consumption Problem

Fly Cheaply: On the Minimum Fuel Consumption Problem Journal of Algorithms 41, 330 337 (2001) doi:10.1006/jagm.2001.1189, available online at http://www.idealibrary.com on Fly Cheaply: On the Minimum Fuel Consumption Problem Timothy M. Chan Department of

More information

The 2-Center Problem in Three Dimensions

The 2-Center Problem in Three Dimensions The 2-Center Problem in Three Dimensions Pankaj K. Agarwal Rinat Ben Avraham Micha Sharir Abstract Let P be a set of n points in R 3. The 2-center problem for P is to find two congruent balls of minimum

More information

A Simple Proof of Optimal Epsilon Nets

A Simple Proof of Optimal Epsilon Nets A Simple Proof of Optimal Epsilon Nets Nabil H. Mustafa Kunal Dutta Arijit Ghosh Abstract Showing the existence of ɛ-nets of small size has been the subject of investigation for almost 30 years, starting

More information

Clustering in High Dimensions. Mihai Bădoiu

Clustering in High Dimensions. Mihai Bădoiu Clustering in High Dimensions by Mihai Bădoiu Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Bachelor of Science

More information

Approximate Range Searching: The Absolute Model

Approximate Range Searching: The Absolute Model Approximate Range Searching: The Absolute Model Guilherme D. da Fonseca Department of Computer Science University of Maryland College Park, Maryland 20742 fonseca@cs.umd.edu David M. Mount Department of

More information

Transforming Hierarchical Trees on Metric Spaces

Transforming Hierarchical Trees on Metric Spaces CCCG 016, Vancouver, British Columbia, August 3 5, 016 Transforming Hierarchical Trees on Metric Spaces Mahmoodreza Jahanseir Donald R. Sheehy Abstract We show how a simple hierarchical tree called a cover

More information

arxiv: v1 [cs.ds] 8 Aug 2014

arxiv: v1 [cs.ds] 8 Aug 2014 Asymptotically exact streaming algorithms Marc Heinrich, Alexander Munteanu, and Christian Sohler Département d Informatique, École Normale Supérieure, Paris, France marc.heinrich@ens.fr Department of

More information

CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms

CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms Antoine Vigneron King Abdullah University of Science and Technology December 5, 2012 Antoine Vigneron (KAUST) CS 372 Lecture

More information

arxiv: v2 [cs.cg] 28 Sep 2012

arxiv: v2 [cs.cg] 28 Sep 2012 No Dimension Independent Core-Sets for Containment under Homothetics arxiv:1010.49v [cs.cg] 8 Sep 01 René Brandenberg Zentrum Mathematik Technische Universität München Boltzmannstr. 3 85747 Garching bei

More information

Approximate Clustering via Core-Sets

Approximate Clustering via Core-Sets Approximate Clustering via Core-Sets Mihai Bădoiu Sariel Har-Peled Piotr Indyk 22/02/2002 17:40 Abstract In this paper, we show that for several clustering problems one can extract a small set of points,

More information

Algorithms for Streaming Data

Algorithms for Streaming Data Algorithms for Piotr Indyk Problems defined over elements P={p 1,,p n } The algorithm sees p 1, then p 2, then p 3, Key fact: it has limited storage Can store only s

More information

Applications of Chebyshev Polynomials to Low-Dimensional Computational Geometry

Applications of Chebyshev Polynomials to Low-Dimensional Computational Geometry Applications of Chebyshev Polynomials to Low-Dimensional Computational Geometry Timothy M. Chan 1 1 Department of Computer Science, University of Illinois at Urbana-Champaign tmc@illinois.edu Abstract

More information

A non-linear lower bound for planar epsilon-nets

A non-linear lower bound for planar epsilon-nets A non-linear lower bound for planar epsilon-nets Noga Alon Abstract We show that the minimum possible size of an ɛ-net for point objects and line (or rectangle)- ranges in the plane is (slightly) bigger

More information

Stability of ε-kernels

Stability of ε-kernels Stability of ε-kernels Pankaj K. Agarwal Jeff M. Phillips Hai Yu June 8, 2010 Abstract Given a set P of n points in R d, an ε-kernel K P approximates the directional width of P in every direction within

More information

Lower Bounds on the Complexity of Simplex Range Reporting on a Pointer Machine

Lower Bounds on the Complexity of Simplex Range Reporting on a Pointer Machine Lower Bounds on the Complexity of Simplex Range Reporting on a Pointer Machine Extended Abstract Bernard Chazelle Department of Computer Science Princeton University Burton Rosenberg Department of Mathematics

More information

On Lines and Joints. August 17, Abstract

On Lines and Joints. August 17, Abstract On Lines and Joints Haim Kaplan Micha Sharir Eugenii Shustin August 17, 2009 Abstract Let L be a set of n lines in R d, for d 3. A joint of L is a point incident to at least d lines of L, not all in a

More information

On the Optimality of the Dimensionality Reduction Method

On the Optimality of the Dimensionality Reduction Method On the Optimality of the Dimensionality Reduction Method Alexandr Andoni MIT andoni@mit.edu Piotr Indyk MIT indyk@mit.edu Mihai Pǎtraşcu MIT mip@mit.edu Abstract We investigate the optimality of (1+)-approximation

More information

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Optimal Data-Dependent Hashing for Approximate Near Neighbors Optimal Data-Dependent Hashing for Approximate Near Neighbors Alexandr Andoni 1 Ilya Razenshteyn 2 1 Simons Institute 2 MIT, CSAIL April 20, 2015 1 / 30 Nearest Neighbor Search (NNS) Let P be an n-point

More information

The Effect of Corners on the Complexity of Approximate Range Searching

The Effect of Corners on the Complexity of Approximate Range Searching Discrete Comput Geom (2009) 41: 398 443 DOI 10.1007/s00454-009-9140-z The Effect of Corners on the Complexity of Approximate Range Searching Sunil Arya Theocharis Malamatos David M. Mount Received: 31

More information

Semi-algebraic Range Reporting and Emptiness Searching with Applications

Semi-algebraic Range Reporting and Emptiness Searching with Applications Semi-algebraic Range Reporting and Emptiness Searching with Applications Micha Sharir Hayim Shaul July 14, 2009 Abstract In a typical range emptiness searching (resp., reporting) problem, we are given

More information

CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms

CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms Instructor: Mohammad T. Hajiaghayi Scribe: Huijing Gong November 11, 2014 1 Overview In the

More information

Geometric Problems in Moderate Dimensions

Geometric Problems in Moderate Dimensions Geometric Problems in Moderate Dimensions Timothy Chan UIUC Basic Problems in Comp. Geometry Orthogonal range search preprocess n points in R d s.t. we can detect, or count, or report points inside a query

More information

On the Most Likely Voronoi Diagram and Nearest Neighbor Searching?

On the Most Likely Voronoi Diagram and Nearest Neighbor Searching? On the Most Likely Voronoi Diagram and Nearest Neighbor Searching? Subhash Suri and Kevin Verbeek Department of Computer Science, University of California, Santa Barbara, USA. Abstract. We consider the

More information

arxiv: v1 [cs.ds] 18 Sep 2015

arxiv: v1 [cs.ds] 18 Sep 2015 Streaming Verification in Data Analysis Samira Daruki Justin Thaler Suresh Venkatasubramanian arxiv:1509.05514v1 [cs.ds] 18 Sep 2015 Abstract Streaming interactive proofs (SIPs) are a framework to reason

More information

Subsampling in Smoothed Range Spaces

Subsampling in Smoothed Range Spaces Subsampling in Smoothed Range Spaces Jeff M. Phillips and Yan Zheng University of Utah Abstract. We consider smoothed versions of geometric range spaces, so an element of the ground set (e.g. a point)

More information

Lecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1

Lecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1 Lecture 10 Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1 Recap Sublinear time algorithms Deterministic + exact: binary search Deterministic + inexact: estimating diameter in a

More information

Higher Cell Probe Lower Bounds for Evaluating Polynomials

Higher Cell Probe Lower Bounds for Evaluating Polynomials Higher Cell Probe Lower Bounds for Evaluating Polynomials Kasper Green Larsen MADALGO, Department of Computer Science Aarhus University Aarhus, Denmark Email: larsen@cs.au.dk Abstract In this paper, we

More information

Supremum of simple stochastic processes

Supremum of simple stochastic processes Subspace embeddings Daniel Hsu COMS 4772 1 Supremum of simple stochastic processes 2 Recap: JL lemma JL lemma. For any ε (0, 1/2), point set S R d of cardinality 16 ln n S = n, and k N such that k, there

More information

Similarity searching, or how to find your neighbors efficiently

Similarity searching, or how to find your neighbors efficiently Similarity searching, or how to find your neighbors efficiently Robert Krauthgamer Weizmann Institute of Science CS Research Day for Prospective Students May 1, 2009 Background Geometric spaces and techniques

More information

A Width of Points in the Streaming Model

A Width of Points in the Streaming Model A Width of Points in the Streaming Model ALEXANDR ANDONI, Microsoft Research HUY L. NGUY ÊN, Princeton University We show how to compute the width of a dynamic set of low-dimensional points in the streaming

More information

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry

Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Frank Nielsen 1 and Gaëtan Hadjeres 1 École Polytechnique, France/Sony Computer Science Laboratories, Japan Frank.Nielsen@acm.org

More information

Choosing Subsets with Maximum Weighted Average

Choosing Subsets with Maximum Weighted Average Choosing Subsets with Maximum Weighted Average David Eppstein Daniel S. Hirschberg Department of Information and Computer Science University of California, Irvine, CA 92717 October 4, 1996 Abstract Given

More information

Space Exploration via Proximity Search

Space Exploration via Proximity Search Space Exploration via Proximity Search Sariel Har-Peled Nirman Kumar David M. Mount Benjamin Raichel July 7, 2018 arxiv:1412.1398v1 [cs.cg] 3 Dec 2014 Abstract We investigate what computational tasks can

More information

Lecture Lecture 9 October 1, 2015

Lecture Lecture 9 October 1, 2015 CS 229r: Algorithms for Big Data Fall 2015 Lecture Lecture 9 October 1, 2015 Prof. Jelani Nelson Scribe: Rachit Singh 1 Overview In the last lecture we covered the distance to monotonicity (DTM) and longest

More information

arxiv: v1 [math.co] 25 Apr 2016

arxiv: v1 [math.co] 25 Apr 2016 On the zone complexity of a ertex Shira Zerbib arxi:604.07268 [math.co] 25 Apr 206 April 26, 206 Abstract Let L be a set of n lines in the real projectie plane in general position. We show that there exists

More information

Lecture 10 + additional notes

Lecture 10 + additional notes CSE533: Information Theorn Computer Science November 1, 2010 Lecturer: Anup Rao Lecture 10 + additional notes Scribe: Mohammad Moharrami 1 Constraint satisfaction problems We start by defining bivariate

More information

Covering an ellipsoid with equal balls

Covering an ellipsoid with equal balls Journal of Combinatorial Theory, Series A 113 (2006) 1667 1676 www.elsevier.com/locate/jcta Covering an ellipsoid with equal balls Ilya Dumer College of Engineering, University of California, Riverside,

More information

Approximate Nearest Neighbor (ANN) Search in High Dimensions

Approximate Nearest Neighbor (ANN) Search in High Dimensions Chapter 17 Approximate Nearest Neighbor (ANN) Search in High Dimensions By Sariel Har-Peled, February 4, 2010 1 17.1 ANN on the Hypercube 17.1.1 Hypercube and Hamming distance Definition 17.1.1 The set

More information

An Algorithmist s Toolkit Nov. 10, Lecture 17

An Algorithmist s Toolkit Nov. 10, Lecture 17 8.409 An Algorithmist s Toolkit Nov. 0, 009 Lecturer: Jonathan Kelner Lecture 7 Johnson-Lindenstrauss Theorem. Recap We first recap a theorem (isoperimetric inequality) and a lemma (concentration) from

More information

Subgradient and Sampling Algorithms for l 1 Regression

Subgradient and Sampling Algorithms for l 1 Regression Subgradient and Sampling Algorithms for l 1 Regression Kenneth L. Clarkson January 4, 2005 Abstract Given an n d matrix A and an n-vector b, the l 1 regression problem is to find the vector x minimizing

More information

Subgradient and Sampling Algorithms for l 1 Regression

Subgradient and Sampling Algorithms for l 1 Regression Subgradient and Sampling Algorithms for l 1 Regression Kenneth L. Clarkson Abstract Given an n d matrix A and an n-vector b, the l 1 regression problem is to find the vector x minimizing the objective

More information

Combinatorial Generalizations of Jung s Theorem

Combinatorial Generalizations of Jung s Theorem Discrete Comput Geom (013) 49:478 484 DOI 10.1007/s00454-013-9491-3 Combinatorial Generalizations of Jung s Theorem Arseniy V. Akopyan Received: 15 October 011 / Revised: 11 February 013 / Accepted: 11

More information

A Mergeable Summaries

A Mergeable Summaries A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires

More information

Communication Complexity

Communication Complexity Communication Complexity Jie Ren Adaptive Signal Processing and Information Theory Group Nov 3 rd, 2014 Jie Ren (Drexel ASPITRG) CC Nov 3 rd, 2014 1 / 77 1 E. Kushilevitz and N. Nisan, Communication Complexity,

More information

Approximating the Radii of Point Sets

Approximating the Radii of Point Sets Approximating the Radii of Point Sets Kasturi Varadarajan S. Venkatesh Yinyu Ye Jiawei Zhang April 17, 2006 Abstract We consider the problem of computing the outer-radii of point sets. In this problem,

More information

Sketching as a Tool for Numerical Linear Algebra

Sketching as a Tool for Numerical Linear Algebra Sketching as a Tool for Numerical Linear Algebra David P. Woodruff presented by Sepehr Assadi o(n) Big Data Reading Group University of Pennsylvania February, 2015 Sepehr Assadi (Penn) Sketching for Numerical

More information

Learning convex bodies is hard

Learning convex bodies is hard Learning convex bodies is hard Navin Goyal Microsoft Research India navingo@microsoft.com Luis Rademacher Georgia Tech lrademac@cc.gatech.edu Abstract We show that learning a convex body in R d, given

More information

Clarkson s Algorithm for Violator Spaces. Yves Brise, ETH Zürich, CCCG Joint work with Bernd Gärtner

Clarkson s Algorithm for Violator Spaces. Yves Brise, ETH Zürich, CCCG Joint work with Bernd Gärtner Clarkson s Algorithm for Violator Spaces Yves Brise, ETH Zürich, CCCG 20090817 Joint work with Bernd Gärtner A Hierarchy of Optimization Frameworks Linear Programming Halfspaces (constraints), Optimization

More information

Multi-Dimensional Online Tracking

Multi-Dimensional Online Tracking Multi-Dimensional Online Tracking Ke Yi and Qin Zhang Hong Kong University of Science & Technology SODA 2009 January 4-6, 2009 1-1 A natural problem Bob: tracker f(t) g(t) Alice: observer (t, g(t)) t 2-1

More information

7. Lecture notes on the ellipsoid algorithm

7. Lecture notes on the ellipsoid algorithm Massachusetts Institute of Technology Michel X. Goemans 18.433: Combinatorial Optimization 7. Lecture notes on the ellipsoid algorithm The simplex algorithm was the first algorithm proposed for linear

More information

On Regular Vertices of the Union of Planar Convex Objects

On Regular Vertices of the Union of Planar Convex Objects On Regular Vertices of the Union of Planar Convex Objects Esther Ezra János Pach Micha Sharir Abstract Let C be a collection of n compact convex sets in the plane, such that the boundaries of any pair

More information

The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal y Otfried Schwarzkopf z Micha Sharir x September 1, 1994 Abstract Let F and G be

The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal y Otfried Schwarzkopf z Micha Sharir x September 1, 1994 Abstract Let F and G be CS-1994-18 The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal Otfried Schwarzkopf Micha Sharir Department of Computer Science Duke University Durham, North Carolina 27708{0129 September

More information

A Separator Theorem for Graphs with an Excluded Minor and its Applications

A Separator Theorem for Graphs with an Excluded Minor and its Applications A Separator Theorem for Graphs with an Excluded Minor and its Applications Noga Alon IBM Almaden Research Center, San Jose, CA 95120,USA and Sackler Faculty of Exact Sciences, Tel Aviv University, Tel

More information

Geometry. On Galleries with No Bad Points. P. Valtr. 1. Introduction

Geometry. On Galleries with No Bad Points. P. Valtr. 1. Introduction Discrete Comput Geom 21:13 200 (1) Discrete & Computational Geometry 1 Springer-Verlag New York Inc. On Galleries with No Bad Points P. Valtr Department of Applied Mathematics, Charles University, Malostranské

More information

Optimal compression of approximate Euclidean distances

Optimal compression of approximate Euclidean distances Optimal compression of approximate Euclidean distances Noga Alon 1 Bo az Klartag 2 Abstract Let X be a set of n points of norm at most 1 in the Euclidean space R k, and suppose ε > 0. An ε-distance sketch

More information

Cell-Probe Proofs and Nondeterministic Cell-Probe Complexity

Cell-Probe Proofs and Nondeterministic Cell-Probe Complexity Cell-obe oofs and Nondeterministic Cell-obe Complexity Yitong Yin Department of Computer Science, Yale University yitong.yin@yale.edu. Abstract. We study the nondeterministic cell-probe complexity of static

More information

Szemerédi-Trotter theorem and applications

Szemerédi-Trotter theorem and applications Szemerédi-Trotter theorem and applications M. Rudnev December 6, 2004 The theorem Abstract These notes cover the material of two Applied post-graduate lectures in Bristol, 2004. Szemerédi-Trotter theorem

More information

Computational Geometry: Theory and Applications

Computational Geometry: Theory and Applications Computational Geometry 43 (2010) 434 444 Contents lists available at ScienceDirect Computational Geometry: Theory and Applications www.elsevier.com/locate/comgeo Approximate range searching: The absolute

More information

Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform

Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform Nir Ailon May 22, 2007 This writeup includes very basic background material for the talk on the Fast Johnson Lindenstrauss Transform

More information

Lecture notes on the ellipsoid algorithm

Lecture notes on the ellipsoid algorithm Massachusetts Institute of Technology Handout 1 18.433: Combinatorial Optimization May 14th, 007 Michel X. Goemans Lecture notes on the ellipsoid algorithm The simplex algorithm was the first algorithm

More information

Sketching in Adversarial Environments

Sketching in Adversarial Environments Sketching in Adversarial Environments Ilya Mironov Moni Naor Gil Segev Abstract We formalize a realistic model for computations over massive data sets. The model, referred to as the adversarial sketch

More information

Economical Delone Sets for Approximating Convex Bodies

Economical Delone Sets for Approximating Convex Bodies Economical Delone Sets for Approimating Conve Bodies Ahmed Abdelkader Department of Computer Science University of Maryland College Park, MD 20742, USA akader@cs.umd.edu https://orcid.org/0000-0002-6749-1807

More information

THE UNIT DISTANCE PROBLEM ON SPHERES

THE UNIT DISTANCE PROBLEM ON SPHERES THE UNIT DISTANCE PROBLEM ON SPHERES KONRAD J. SWANEPOEL AND PAVEL VALTR Abstract. For any D > 1 and for any n 2 we construct a set of n points on a sphere in R 3 of diameter D determining at least cn

More information

Nondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity

Nondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Matt Brown Date: January 25, 2012 LECTURE 5 Nondeterminism In this lecture, we introduce nondeterministic

More information

A Note on Smaller Fractional Helly Numbers

A Note on Smaller Fractional Helly Numbers A Note on Smaller Fractional Helly Numbers Rom Pinchasi October 4, 204 Abstract Let F be a family of geometric objects in R d such that the complexity (number of faces of all dimensions on the boundary

More information

Coresets for k-means and k-median Clustering and their Applications

Coresets for k-means and k-median Clustering and their Applications Coresets for k-means and k-median Clustering and their Applications Sariel Har-Peled Soham Mazumdar November 7, 2003 Abstract In this paper, we show the existence of small coresets for the problems of

More information

Cell-Probe Lower Bounds for the Partial Match Problem

Cell-Probe Lower Bounds for the Partial Match Problem Cell-Probe Lower Bounds for the Partial Match Problem T.S. Jayram Subhash Khot Ravi Kumar Yuval Rabani Abstract Given a database of n points in {0, 1} d, the partial match problem is: In response to a

More information

Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end wit

Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end wit Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end with some cryptic AMS subject classications and a few

More information

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window Costas Busch 1 and Srikanta Tirthapura 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY

More information

Reporting Neighbors in High-Dimensional Euclidean Space

Reporting Neighbors in High-Dimensional Euclidean Space Reporting Neighbors in High-Dimensional Euclidean Space Dror Aiger Haim Kaplan Micha Sharir Abstract We consider the following problem, which arises in many database and web-based applications: Given a

More information

A CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS

A CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS A CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS Santanu Bhowmick, Kasturi Varadarajan, and Shi-Ke Xue Abstract. We consider the following multi-covering problem with disks. We are given two

More information

A lower bound for scheduling of unit jobs with immediate decision on parallel machines

A lower bound for scheduling of unit jobs with immediate decision on parallel machines A lower bound for scheduling of unit jobs with immediate decision on parallel machines Tomáš Ebenlendr Jiří Sgall Abstract Consider scheduling of unit jobs with release times and deadlines on m identical

More information

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Emanuele Viola July 6, 2009 Abstract We prove that to store strings x {0, 1} n so that each prefix sum a.k.a. rank query Sumi := k i x k can

More information

A Las Vegas approximation algorithm for metric 1-median selection

A Las Vegas approximation algorithm for metric 1-median selection A Las Vegas approximation algorithm for metric -median selection arxiv:70.0306v [cs.ds] 5 Feb 07 Ching-Lueh Chang February 8, 07 Abstract Given an n-point metric space, consider the problem of finding

More information

Diameter and k-center in Sliding Windows

Diameter and k-center in Sliding Windows Diameter and k-center in Sliding Windows Vincent Cohen-Addad 1, Chris Schwiegelshohn 2, and Christian Sohler 2 1 Département d informatique, École normale supérieure, Paris, France. vincent.cohen@ens.fr

More information

The Overlay of Lower Envelopes and its Applications. March 16, Abstract

The Overlay of Lower Envelopes and its Applications. March 16, Abstract The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal y Otfried Schwarzkopf z Micha Sharir x March 16, 1996 Abstract Let F and G be two collections of a total of n (possibly partially-dened)

More information

The concentration of the chromatic number of random graphs

The concentration of the chromatic number of random graphs The concentration of the chromatic number of random graphs Noga Alon Michael Krivelevich Abstract We prove that for every constant δ > 0 the chromatic number of the random graph G(n, p) with p = n 1/2

More information

Lecture 16 Oct 21, 2014

Lecture 16 Oct 21, 2014 CS 395T: Sublinear Algorithms Fall 24 Prof. Eric Price Lecture 6 Oct 2, 24 Scribe: Chi-Kit Lam Overview In this lecture we will talk about information and compression, which the Huffman coding can achieve

More information

On (ε, k)-min-wise independent permutations

On (ε, k)-min-wise independent permutations On ε, -min-wise independent permutations Noga Alon nogaa@post.tau.ac.il Toshiya Itoh titoh@dac.gsic.titech.ac.jp Tatsuya Nagatani Nagatani.Tatsuya@aj.MitsubishiElectric.co.jp Abstract A family of permutations

More information

Nonlinear Discrete Optimization

Nonlinear Discrete Optimization Nonlinear Discrete Optimization Technion Israel Institute of Technology http://ie.technion.ac.il/~onn Billerafest 2008 - conference in honor of Lou Billera's 65th birthday (Update on Lecture Series given

More information

Euclidean Shortest Path Algorithm for Spherical Obstacles

Euclidean Shortest Path Algorithm for Spherical Obstacles Euclidean Shortest Path Algorithm for Spherical Obstacles Fajie Li 1, Gisela Klette 2, and Reinhard Klette 3 1 College of Computer Science and Technology, Huaqiao University Xiamen, Fujian, China 2 School

More information

Linear-Time Approximation Algorithms for Unit Disk Graphs

Linear-Time Approximation Algorithms for Unit Disk Graphs Linear-Time Approximation Algorithms for Unit Disk Graphs Guilherme D. da Fonseca Vinícius G. Pereira de Sá elina M. H. de Figueiredo Universidade Federal do Rio de Janeiro, Brazil Abstract Numerous approximation

More information

arxiv: v1 [cs.cg] 29 Jun 2012

arxiv: v1 [cs.cg] 29 Jun 2012 Single-Source Dilation-Bounded Minimum Spanning Trees Otfried Cheong Changryeol Lee May 2, 2014 arxiv:1206.6943v1 [cs.cg] 29 Jun 2012 Abstract Given a set S of points in the plane, a geometric network

More information

Randomized Time Space Tradeoffs for Directed Graph Connectivity

Randomized Time Space Tradeoffs for Directed Graph Connectivity Randomized Time Space Tradeoffs for Directed Graph Connectivity Parikshit Gopalan, Richard Lipton, and Aranyak Mehta College of Computing, Georgia Institute of Technology, Atlanta GA 30309, USA. Email:

More information

Union of Random Minkowski Sums and Network Vulnerability Analysis

Union of Random Minkowski Sums and Network Vulnerability Analysis Union of Random Minkowski Sums and Network Vulnerability Analysis Pankaj K. Agarwal Sariel Har-Peled Haim Kaplan Micha Sharir June 29, 2014 Abstract Let C = {C 1,..., C n } be a set of n pairwise-disjoint

More information

On the communication and streaming complexity of maximum bipartite matching

On the communication and streaming complexity of maximum bipartite matching On the communication and streaming complexity of maximum bipartite matching Ashish Goel Michael Kapralov Sanjeev Khanna November 27, 2 Abstract Consider the following communication problem. Alice holds

More information