Streaming Algorithms for Extent Problems in High Dimensions
|
|
- Clemence Elliott
- 6 years ago
- Views:
Transcription
1 Streaming Algorithms for Extent Problems in High Dimensions Pankaj K. Agarwal R. Sharathkumar Department of Computer Science Duke University {pankaj, Abstract We develop single-pass) streaming algorithms for maintaining extent measures of a stream S of n points in R d. We focus on designing streaming algorithms whose working space is polynomial in d polyd)) and sublinear in n. For the problems of computing diameter, width and minimum enclosing ball of S, we obtain lower bounds on the worst-case approximation ratio of any streaming algorithm that uses polyd) space. On the positive side, we introduce the notion of blurred ball cover and use it for answering approximate farthestpoint queries and maintaining approximate minimum enclosing ball and diameter of S. We describe a streaming algorithm for maintaining a blurred ball cover whose working space is linear in d and independent of n. 1 Introduction In many applications, data arrives rapidly as a stream of points and there is limited space to store the input. Algorithms in the streaming model have to work with one or few passes over the data, using small space. Motivated by a wide range of applications, including networking, databases and geographic information systems, there is extensive work on streaming algorithms. Often, it is impossible to compute exact solutions given the space constraints. Hence most of the work has focused on developing approximation algorithms and several novel techniques have been developed. See books [5, 7] for a comprehensive review. In this paper, we design streaming algorithms for several geometric problems in high dimensions. Problem statement. For a point x R d and a value r > 0, let Bx,r) denote a ball of radius r centered at x. For a ball B, let cb), rb) denote its center and radius Work on this paper is supported by NSF under grants CNS , CCF , CCF and DEB , by ARO grants W911NF and W911NF , by an NIH grant 1P50-GM , by a DOE grant OEG- P00A070505, and by a grant from the U.S. Israel Binational Science Foundation. respectively, and let 1 + ε)b denote the ε-expansion of B, i.e., BcB),1 + ε)rb)). Let S be a finite) set of points in R d. We use MEBS) to denote the smallest ball that contains S. For a parameter α > 1, a ball B is called an α- approximation of the minimum enclosing ball of S, or α-mebs) for brevity, if S B and rb) α rmebs)). A set C S is called an α-coresets) if S α MEBC). Let diams) denote any farthest pair diameteral pair) of points in S, and any pair of points r,s S such that, for every pair p,q S, pq α rs, is referred to as α-diams) α-diameteral pair). For any slab J, which is a region bounded by a pair h 1,h ) of parallel d 1)-dimensional hyperplanes, let dj) be the minimum distance between h 1 and h. Let widths) be any slab J such that S J and dj) is minimized. For α > 1, α-widths) is a slab J such that S J and dwidths)) dj ) α dwidths)). Finally, for a point x R d, an α-farthest-neighbor of x, α-fnx) is a point q S such that, for every p S, xp α xq. In this paper, we study the problems of maintaining α-mebs), α-coresets), α-widths), and α-diams), and of answering α-fnx) queries, in the streaming model. We assume that the input points arrive one by one, and the algorithm updates the information quickly using little space. Our goal is to design streaming algorithms whose working space and update time are polynomial in d and sub-linear in the size of S. Related work. There is extensive literature on fast algorithms for computing various extent measures, such as diameter, width, and smallest enclosing ball of a point set in low dimensions [, 14, 15, 17, 6, 8]. Agarwal et al. [1] have developed approximation algorithms for computing many extent measures of a set of n points in R d using coresets whose running time is On + 1/ε Od) ); see also [, 4, 11, 1]. Although these algorithms work well in small dimensions, the exponential dependency on d makes them unsuitable for higher dimensions. This has led to work on develop- 1
2 ing polynomial-time approximation algorithms in n, d and 1/ε) for different extent measures and other related problems. For example, Bădoiu and Clarkson [9] develop an elegant coreset-based algorithm that given a point set S R d computes a 1 + ε)-mebs) in time Ond/ε+1/ε 5 ). See [10, 1,, 4] for other similar results. Clarkson [16] presented a general framework that gives coreset-based algorithms for a number of geometric problems in high dimensions; see also [19]. Goel et al. [0] describe fast approximation algorithms for several proximity problems. For instance, they describe a data structure that answers -FNx) queries in Od) time after spending Ond) time in preprocessing. There is also some work on streaming algorithms for extent problems in high dimensions. Chan and Zarrabi- Zadeh [9] gave a streaming algorithm, which maintains a /)-MEBS), for a stream S R d, using Od) space. They also proved that any deterministic algorithm that stores only a single ball in its working space cannot maintain a better than α-mebs), for α 1+. Indyk [] proposed a streaming algorithm for maintaining α-diams) for α >, using Odn 1/α 1) log n) space. Recently Clarkson and Woodruff [18] described streaming algorithms for several problems in numerical algebra. Our results. We prove upper and lower bounds on the size of data structures for maintaining various extent measures under the streaming model. First, we present in Section, a data structure in the streaming model called the ε-blurred ball cover, for ε > 0, that maintains a subset K S of O1/ε )log1/ε)) points. Roughly speaking K is the union of a set of {K 1,...,K u }, u = O1/ε )log1/ε)), of subsets of S each of size O1/ε) so that S lies in the union of 1 + ε)mebk i ). The data structure can be updated in Od/ε )log1/ε)) amortized time. We then show in Section that K can be used to: i) compute a α-fnx), for any query x R d and for α = + ε; ii) maintain a α-diams) for α = + ε; iii) maintain a α-coresets), for α = + ε; iv) maintain a α-mebs) for α = 1+ + ε. Next, we show in Section 4 that any randomized streaming algorithm that maintains α-diams), α- widths), α-mebs), or α-coresets) with probability at least /, requires Ωmin S,expd 1/ ))) space for certain values of α. In particular, α < 1 /d 1/ ) for α-diams) and α-coresets), α d 1/ /8 for α- widths), and α < 1+ 1 /d 1/ ) for α-mebs). All lower bounds are obtained using known lower bounds on the one-round communication complexity of indexing [5]. Communication complexity has been widely used to prove lower bounds on the size of various data structures including in the streaming model [6, 7, 18]. Our algorithms for maintaining α-diams) and α- coresets) are close to optimal. Note the contrast between the lower and upper bounds for MEB a coreset based algorithm has a lower bound of, but we are able to circumvent this bound by, roughly speaking, maintaining a set of O1/ε )log1/ε)) balls and returning a ball that contains these balls. We can regard the centers of these balls as weighted points and thus we can get better results by maintaining a weighted weak coreset it is not a subset of input points. Although it is known that weak ε-nets are more powerful than ε-nets [1], we are unaware of similar results for coresets of MEB and other extent measures. Blurred Ball Cover This section defines the notion of blurred ball cover and describes an algorithm for maintaining such a cover in the streaming model. For a parameter 0 < ε 1, an ε- blurred ball cover of a set S of n points in R d, denoted by K = KS), is a sequence K 1,K,...,K u, where each K i S is a subset of O1/ε) points that satisfies the following three properties; let B i = MEBK i ) and K = i u K i. P1) For all 1 i < u, rb i+1 ) 1 + ε /8)rB i ); P) For all i < j, K i 1 + ε)b j ; P) For every p S, i u such that p 1 + ε)b i. Algorithm 1 describes a simple procedure called UpdateK,A), that given K := KS) and a set of points A R d, computes KS A). If we update K as each new point arrives, then A consists of a single point. However, as we will see below, it will be more efficient for some of our applications to update A in a batched mode. That is, newly arrived points are stored in a buffer A and when its size exceeds certain threshold, Update procedure is called to update K. Update relies on a procedure Approx-MEBZ,ε) that takes a set Z of points and a parameter 0 < ε 1 and returns a set G Z of O1/ε) points and B = MEBG) such that Z 1 + ε)b. Bădoiu and Clarkson [9] have described such a procedure with Od Z /ε + 1/ε 5 ) running time. For efficiency reasons, we explicitly maintain B i = MEBK i ) for every K i K. Update procedure. If there is a point in A that does not lie in the union of the ε-expansions of B i s, then the Update procedure invokes Approx-MEB on K A
3 B i c i c B i B q h c i c B q a) b) Figure 1: Proof of Lemma.. a) c i c εr i /, b) c i c > εr i /. with approximation ratio ε/. Let K K be the point set and B = MEBK ) be the ball returned by Approx-MEB. The Update procedure adds K to K and then deletes all K i s for which rk i ) εrk )/4. Algorithm 1 UpdateK, A) 1: if p A, i u, p 1 + ε)b i then : K,B := Approx-MEBK A),ε/) : K D := {K i rb i ) εrb )/4} 4: K := K \ K D ) K 5: end if Correctness. The proof of correctness relies on the following well-known observation; see e.g. [10]. Lemma.1. Let P be a set of points in R d and let B = MEBP). Then any closed halfspace that contains cb) also contains at least one point of P that is on B. Lemma.. For any i < u, rb i+1 ) 1+ε /8)rB i ). Proof. Update procedure adds K i+1 to K only if there is a point q A such that q 1 + ε)b i. Let B = MEB K {q}); rb i+1 ) rb ). Let r = rb ), c = cb ), c i = cb i ), and r i = rb i ). We prove that r > 1 + ε /8)r i. If c i c εr i / Figure 1a)), then r c q c i q c c i 1 + ε)r i εr i / 1 + ε /8)r i. On the other hand, if c c i > εr i / Figure 1b)), then let h be the hyperplane passing through c i with the direction c i c as its normal, and let h + be the halfspace, bounded by h, that does not contain c. By Lemma.1, there is a point q K i h + that is at a distance r i from c i. Therefore r q c c i c + q c i ) 1/ εr i /) + r i ) 1/ 1 + ε /8)r i. We now prove that P1) P) are maintained by Update. If for every p A, there is an i u such that p 1+ε)B i, then the set K does not change, and P1) P) continue to hold. Hence, assume that there is a p A such that p B i for all i u. By Lemma., property P1) continues to hold after each update. Note that if p 1 + ε)b i for all 1 i u, then Update computes a 1+ε/)-MEBK A). Since K is the only subset added to K, K 1 + ε)b and K is added at the end of the sequence, K i 1+ε)B. Thus property P) holds after each Update. Note that a prefix K D is deleted from K, so P) may be violated. However, the next two lemmas prove that P) is also satisfied. Lemma.. For i < j, cb i ) 1 + ε)b j. Proof. Suppose, on the contrary, that cb i ) 1+ε)B j. Let γ be a halfspace that contains cb i ) but γ 1 + ε)b j =. By Lemma.1, γ contains a point p K i, but p 1+ε)B j, which contradicts the fact K i 1+ε)B j by P)). Hence, cb i ) 1 + ε)b j. Lemma.4. For all K i K D, 1 + ε)b i 1 + ε)b. Proof. Let c = cb ), r = rb ), c i = cb i ), and r i = rb i ). Let K i K D, then r i εr /4. Since K i 1 + ε/)b, the proof of Lemma. implies cb i ) 1 + ε/)b. For any point x 1 + ε)b i, xc c c i + c i x c c i ε)r i 1 + ε/)r + ε1 + ε)r /4 1 + ε)r.
4 Therefore q 1+ε)B and thus 1+ε)B i 1+ε)B. Lemma.4 immediately implies that for any K i K D, if there is a point q 1+ε)B i then q 1+ε)B. Hence, for every q S, there is an i such that q 1 + ε)b i, implying P). Size and update time. Let r i = rb i ). Update ensures that r u /r 1 4/ε. By P1), r i ε /8)r i. Therefore u log 1+ε /84/ε) = O1/ε )log1/ε)). Hence K K i = O1/ε )log1/ε)). Note that Update spends Ou. A ) = Od A /ε )log1/ε)) time to test the condition in line 1. The time spent in computing K and B in line is O A + K )d/ε + 1/ε 5 )) or Od A /ε) + d/ε 4 )log1/ε) + 1/ε 5 )). If we update K after the arrival of each new point, then the update time is Od/ε 5 ). However, if we batch the updates and invoke Update only after O1/ε ) points have arrived, the total time spent will be Od/ε 5 )log1/ε)) which is the time taken by line 1 of the update procedure. The amortized time for Update will then be Od/ε )log1/ε)). We thus conclude the following. Theorem.1. For any 0 < ε 1, an ε-blurred ball cover of size Od/ε )log1/ε)) of a stream of points can be maintained in Od/ε )log1/ε)) amortized time. Applications of Blurred Ball Cover We now describe streaming algorithms for answering + ε)-fnx) queries and for maintaining + ε)- diams), + ε)-mebs) and 1+ + ε)-mebs) using the blurred ball cover. We use the following simple inequality repeatedly. For any x, y R,.1) x + y) x + y ) 1/. Farthest-neighbor queries. We maintain an ε- blurred ball cover K and update it in the batched mode. Let A be the set of newly arrived points in S that have not been processed yet. Set Q = K A; Q = O1/ε )log1/ε)). For any query point x R d, we return the point of Q that is farthest from x. We claim that for any x R d, max xp + ε)max qx. p S q Q by P), there is an i u such that p 1 + ε)b i. Assuming c i = cb i ) and r i = rb i ),.) xp xc i + c i p xc i ε)r i. Let h be the hyperplane passing through c i and normal to xc i and let h + be the halfspace, bounded by h, that does not contain x. By Lemma.1, there is a point z K i such that zc i = r i. Since xc i z is obtuse,.) xz xc i + c i z ) 1/ xc i + r i ) 1/. By combining.) and.) and using.1), xp xc i + r i ) 1/ + εr i + ε) xz + ε) xq. We conclude the following. Theorem.1. For a stream S of points in R d and a parameter 0 < ε 1, there is a data structure of size Od/ε )log1/ε)) that answers + ε)-fnx) in time Od/ε )log1/ε)). The amortized update time is Od/ε )log1/ε)). Diameter. For a point set S, we maintain a α-diams), for α = + ε, using the above data structure for answering α-fnx) queries. Let p,q be the current α- diameteral pair maintained by the algorithm. When a new point z arrives, we compute the w = α-fnz). If wz > pq, we replace p,q) with w,z). We conclude the following. Theorem.. For a stream S of n points in R d, there is a data structure of size Od/ε )log1/ε)) that maintains a + ε)-diams). The amortized update time of this structure is Od/ε )log1/ε)). Coreset for minimum enclosing ball. For a stream S of points, we maintain an ε-blurred ball cover K and update it in the batched mode. Let A be the set of points not processed by Update and let Q = K A. Let B = MEBQ), c = cb) and r = rb). We argue that for every K i K, 1 + ε)b i + ε)b. Lemma.1. For all i u, 1 + ε)b i + ε)b. Let p = arg max p S px and q = arg max q Q qx. If p A then p = q and the claim is obviously true. If p S \A, i.e., p has been processed by Update, then, 4 Proof. For any ball B i in the blurred ball cover, let c i = cb i ) and r i = rb i ). Observe that K i B. Let t = arg max t 1+ε)Bi tc. t c = cc i ε)r i.
5 By Lemma.1, there is a point p K i such that pc cc i + ri. But pc r. Hence, t c pc cc i ε)r i cci + ri + ε, or t c +ε) pc +ε)r. Thus 1+ε)B i + ε)b. Lemma.1 in conjunction with property P) of blurred ball cover implies that S + ε)b. Thus Q is a + ε)-coresets). We conclude the following. Theorem.. For a stream S of points in R d and a parameter 0 < ε 1, there is a data structure of size Od/ε )log1/ε)) that maintains a +ε)-coresets) of minimum enclosing ball. The amortized update time of this structure is Od/ε )log1/ε)). Minimum enclosing ball. For a stream S of points, we maintain a ε/9)-blurred ball cover K of S. K is updated whenever a new point arrives. Let B = {B 1,...,B u }. Let B = MEBB). We return 1 + ε/)b which can be computed in time O1/ε 5 ) []. Hence the total update time is O1/ε 5 ). Let ˆr = rmebs)). Lemma.. rb 1 + ) ) + ε/ ˆr. Proof. Let K = K i B i B and B = {B i K i K }. Note that any subsequence of K, and hence K, satisfies properties P1) and P) of blurred ball cover. Let B t be the largest ball in B. Let c t = cb t ), r t = rb t ),c = cb ),r = rb ) and let k = r /r t. Observe that c t c = r r t = k 1)r t. Let h be a hyperplane passing through c with a normal parallel to c t c. Let h + be the halfspace bounded by h and that does not contain c t. By Lemma.1, there is a point b i B B i, for some i, such that b i c = r = kr t. Since c t c b i is obtuse, we have.4) c t b i k 1) rt + k rt. Since the proof of Lemma.1 requires only P) of blurred ball cover, it holds for any subsequence of K. In particular, it holds for K. Hence, for all B i B, B i + ε/9)b t, implying that.5) c t b i + ε/9)r t. Combining.4) and.5), we have k 1) r t + k r t + ε/9) r t 1 + ε/9) r t. 5 Solving this quadratic equation, we can deduce that, k 1+ + ε/. Thus r 1 + ) kr t kˆr + ε/ ˆr. By P), S i u1 + ε/9)b i 1 + ε/9)b. By Lemma., we have r 1 + ) + ε/ ˆr. Thus 1 + ε/)r implying that 1 + ε/)b 1 + ) + ε/ 1 + ) + ε ˆr, is indeed a MEBS). We conclude the following. 1 + ε/)ˆr 1+ ) + ε - Theorem.4. Given a stream S of points in R d, there is a data structure of size Od/ε )log1/ε)) that maintains a +1 + ε)-mebs). The worst-case) update time of this structure is Od/ε 5 ). Remark. A slightly better bound on r, the radius of the ball returned by the above algorithm can be obtained using a more involved argument, While, from the lower bounds on α-mebs) in Section 4 it follows that r > 1+, the following example shows that one cannot hope to prove r = 1+ + ε. Let the input be a stream S = p 1,p,p,p 4,..., where p 1 = 1 1,, ), p = 1 1,, ), p = 1, 1,0), p 4 = 1,0,0), and p 5,p 6,..., are points on a unit sphere in R. When the points in S arrive in this order, after the arrival of p 4, the blurred ball cover will consist of K = {{p 1,p }, {p 1,p,p }, {p 1,p,p,p 4 }}. The three balls {B 1,B,B } of K are described by ) ) 1 1 B 1 = B,,0,, ) ) 1 B = B,0,0 1,, B = B0,0,0),1).
6 Let B = MEBB 1 B ) with rb ) = 1 + )/ and cb ) = 1/ ) 1/),0,0). Let q be the farthest point of B from cb ). qcb ) = / > 1+. We evaluate B to be θ-mebs) where θ Ct,/d 1/ ) C t,/d 1/ ). Let t resp. t ) be a point on the boundary of Cs,/d 1/ ) resp. C s,/d 1/ )). Then st st st. A simple trignometric calculation shows that see Figure a)) 4 Lower Bounds In this section we show that any randomized streaming algorithm that maintains α-diams), α-widths), α- MEBS), or α-coresets) with probability at least /, requires Ωmin S,expd 1/ ))) space for certain values of α. In particular, α < 1 /d 1/ ) for α-diams) and α-coresets), α d 1/ /8 for α-widths), and α < 1+ 1 /d 1/ ) for α-mebs). Let S d 1 denote d 1)-dimensional unit sphere centered at origin. We show how to sample points from S d 1 which will be crucial for proving our lower bounds. Sampling in S d 1. Let u S d 1 and ε 0,1). Let H be the hyperplane x,u = ε, i.e., H is the hyperplane that is at distance ε from the origin and normal to the vector u. H divides S d 1 into two spherical regions. We refer to the smaller spherical region as a spherical cap and denote it by Cu,ε). We define its measure to be µu,ε) = SACu,ε)) SAS d 1 ), where SAX) is the surface area of X. known [8] that 4.6) µu,ε) exp dε /). It is well Lemma 4.1. There is a centrally symmetric point set K S d 1 of size Ωexpd 1/ )) such that for any pair of distinct points p,q K if p q, then 1 /d 1/ ) pq 1 + /d 1/ ). Proof. The set K is constructed incrementally. We maintain the invariant that if p,q K and q p then q Cp,/d 1/ ). We also maintain a centrally symmetric region F = S d 1 \ q K Cq,/d1/ ). Initially K = and F = S d 1. If F, we choose an antipodal pair of points p, p from F. By construction, p, p Cq,/d 1/ ) for any point q K. Moreover, by symmetry, no point in K lies in Cp,/d 1/ ) C p,/d 1/ ). We add p and p to K. We set F = F \ { Cp,/d 1/ ) C p,/d 1/ ) } to ensure that the invariant holds after adding p and p. The algorithm terminates when F =. By 4.6), µp,/d 1/ )+µ p,/d 1/ ) exp d 1/ ), therefore the above algorithm performs Ωexpd 1/ )) steps before it stops. Hence K = Ωexpd 1/ )). Let s,t K be such that s t, t. By construction, s 6 st st 1 + /d 1/ ) + 1 4/d / ) 1 + /d 1/ ) 1/ 1 + /d 1/ ). Similarly one can show st st 1 /d 1/ ). ) 1/ Diameter. The problem Index is defined as follows. Let Alice and Bob be two players. Suppose Alice has a binary string σ {0,1} k and Bob has an index i [1 : k]. For any j [1 : k], let σ j be the bit that appears in the jth position of σ. Alice sends a message to Bob after which Bob needs to determine σ i with probability at least /. It is well-known that the length of any message sent by Alice that helps Bob determine σ i with probability at least / is Ωk) [5]. We choose the smallest d so that k expd 1/ ). Hence k = Θexpd 1/ )). Let A be a streaming algorithm that maintains α-diams) for α 1 /d 1/ ) with probability at least /. We choose expd 1/ ) points on S d 1 using Lemma 4.1. Let L be the subset of at least k of these points that lie on the hemisphere x 1 0. We sort L in lexicographic order, which induces a map ϕ : [1 : k] L. The map ϕ is independent of the input string σ and the index i. Hence we can assume that both Alice and Bob know ϕ without communicating any bits. For any σ {0,1} k, let ϕσ) = {ϕx) σ x = 1}. Alice adds ϕσ) as input to A in an arbitrary order. Then, Alice communicates the working space of A to Bob. In order to determine the σ i, Bob adds as input, ϕi) to A. Let p,q S be the pair of points returned by A at the end of the protocol. If p = q, Bob determines σ i = 1. Else, Bob determines σ i = 0. The following lemma shows the correctness of this protocol. Lemma 4.. For any σ {0,1} k and i [1 : k]. Let S = ϕσ) { ϕσ i )} and α 1 /d 1/ ). Every α-diams) pair is an antipodal pair if and only if σ i = 1. Proof. Since all points in ϕσ) lie in one hemisphere, there is an antipodal pair of points in S if and only if ϕi) ϕσ), i.e. σ i = 1. If σ i = 0, then ϕi) ϕσ)
7 s t t t p 1 ε ε ϕi) ϕi) s a) q i 1 + ǫ) b) Figure : a) Proof of Lemma 1, b) Lower bound for α-mebs); here ε = /d 1/. and A does not return an antipodal pair of points. On the other hand, suppose σ i = 1 and suppose there is an α-diameteral pair p,q S such that p q. Since {ϕσ i ), ϕσ i )} S, diams). From Lemma 4.1, it follows that pq 1 + /d 1/ ), i.e., diams) > α pq for α 1 /d 1/ ). Thus p,q is not an α- diameteral pair implying that α-diams) is an antipodal pair of points. Since A returns an α-diams) with probablity at least /, Bob determines σ i correctly with the same probability. Since the communication complexity of any randomized algorithm for the indexing problem is Ωk) = Ωexpd 1/ )), it follows that the work space of A is Ωexpd 1/ )). We thus conclude the following. Theorem 4.1. Any streaming algorithm that maintains an α-diams) of a set S of n points in R d, for α < 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Minimum enclosing ball. Let A be an algorithm that maintains an α-mebs), for α 1+ 1 /d 1/ ) with probability at least /. The map ϕ is defined as above. Alice passes points in ϕσ) as input to A in an arbitrary order and communicates the working space of A to Bob. For index i, let q i be the point that is in the direction ϕi) and at distance 1 + /d 1/ ) resp /d 1/ )) from ϕi)resp. ϕi)); see Figure b). Bob adds q i as input to A. If A returns a ball of radius at least ), then Bob declares σ i = 1. Otherwise, Bob declares σ i = 0. Let S = {ϕσ) {q i }}. Note that ϕi)q i = /d 1/ ). Hence if σ i = 1, then ϕi),q i S and hence the radius of any ball containing S must be at least /d 1/ ) from ϕi). Since, q i ϕi)) = 1 + /d 1/ ), the radius of MEBS) is at most 1+/d 1/ ). In this case, the radius of any α-mebs) is at most α rmebs) ), for α < 1+ 1 /d 1/ ). Since A outputs a correct α-mebs) with probability at least /, Bob can determine σ i with the same probability. We can thus conclude that the workspace of A is Ωexpd 1/ )) and obtain the following. Theorem 4.. Any streaming algorithm that maintains an α-mebs) of a set S of n points in R d, for α < 1+ 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Coreset for minimum enclosing ball. The reduction to α-mebs) can be modified to obtain lower bounds on α-coreset. Let A be an algorithm that maintains a α-coresets) for α 1 /d 1/ ) with probability /. Let ϕ be a map as defined previously. Alice passes points in ϕσ) as input to A in an arbitrary order and communicates the working space of A to Bob. For index i, q i is the point as defined above; see Figure b). Let C be α-coreset maintained by A. For index i, Bob adds q i as input to A and checks whether ϕi) C. If ϕi) C, Bob reports σ i = 1. Otherwise, Bob reports σ i = 0. The correctness of this reduction follows from the following lemma. Lemma 4.. If C is an α-coresetϕσ) {q i }), for α < 1 /d 1/ ) then, i) q i C, and ii) ϕi) C if and only if σ i = 1. ϕi)q i / = /d 1/ ))/ > On the other hand, if σ i = 0, then ϕi) S. By Lemma 4.1, all points in ϕσ) are within distance 1+ 7 Proof. Let S = ϕσ) {q i }. First, we prove i). Suppose, on the contrary, that q i C and hence C ϕσ). Let B d be the unit ball centered at origin. Let β be MEBC). Since C S d 1, rβ) 1 and
8 cβ) B d. Observe that, for any point t B d, tq i 1 + /d 1/ ). Hence, cβ)q i 1 + /d 1/ ) 1 + /d 1/ )rβ) > α rβ), leading to a contradiction that C is an α-coresets). Now, we prove ii). If σ i = 0, then ϕi) S and C S implies ϕi) C. On the other hand, suppose σ i = 1 but ϕi) C. Let β = MEBC). Observe that all points in C are within a distance 1+/d 1/ ) from ϕi). Hence rβ) 1 + /d 1/ ). It follows from i) that the center cβ) β, where β = Bq i, 1 + /d 1/ )). For any point p β, pϕi), therefore cβ)ϕi) 1 /d 1/ ) 1 + /d 1/ ) α rβ), for α < 1 /d 1/ ), leading to a contradiction that C is an α-coresets). Hence, we conclude the following. Theorem 4.. Any streaming algorithm that maintains an α-coresets) of a set S of n points in R d, for α < 1 /d 1/ ), with probability at least / requires Ωmin { n,expd 1/ ) } ) bits of storage. Width. Let B be a ball centered at origin with rb) = 1/. For any vector u S d 1, let h u be a hyperplane passing through the origin and normal to u, i.e., x,u = 0. Let N u h u be a set of d points such that convn u { u,u}) contains B; see Figure. For any σ {0,1} k, let ϕσ) = { p p ϕσ)}. v v 1 u u Figure : N u = {v 1,v,v }, B convn u { u,u}). B v h u to A in an arbitrary order and then communicates the working space of A to Bob. For index i, Bob adds N ϕi) to A. If A reports a slab J such that dj) is at least 1, then Bob reports σ i = 1 and otherwise Bob reports σ i = 0. The correctness of the reduction follows from the following observation. Let S = {ϕσ) ϕσ) N ϕi) }. If σ i = 1, then {N ϕi) { ϕi),ϕi)}} S. By construction of N ϕi), B convn ϕi) {ϕi), ϕi)}). Since rb) = 1/, dj) dwidths)) 1. On the other hand, if σ i = 0, then ϕi),ϕi)) S. Let W be a slab bounded by the two hyperplanes x,ϕi) = /d 1/ and x,ϕi) = /d 1/. By Lemma 4.1 and the fact that N ϕi) h ϕi) W, we have S W and dwidths)) 4/d 1/. For α < d 1/ /8, dj) α dwidths)) α 4/d 1/ 1/. Hence, we conclude the following. Theorem 4.4. Any streaming algorithm that maintains an α-widths) of a set of n points in R d, for α < d 1/ /8, with probability at least /, requires Ωmin { n,expd 1/ ) } ) bits of storage. 5 Conclusion In this paper, we design streaming algorithms for maintaining several extent measures on high-dimensional point sets. Our approach is based on the notion of blurred ball cover that, for a stream S of points, maintains a subset K S. K can be interpreted as the union of a set of {K 1,...,K u }, u = O1/ε )log1/ε)), of subsets of S each of size O1/ε) so that S lies in the union of 1+ε)MEBK i ). We provide a simple streaming algorithm for maintaing blurred ball cover. We then use blurred ball cover to provide streaming algorithms for various extent measures. We conclude by stating related problems. Is there a fully dynamic data structure, whose size is linear in n and polyd,1/ε), that maintains a 1 + ε)-mebs)? Can the concept of blurred ball cover be generalized for maintaining approximate smallest encosing convex shape of a high dimensional point set, e.g., minimum enclosing ellipsoids, under the streaming model? The Index problem can be reduced to α-widths) as follows. Let A be a streaming algorithm that maintains α-widths), for α < d 1/ /8, with probability at least /. Alice passes points in {ϕσ) ϕσ)} 8 References [1] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan, Approximating extent measures of points, J. ACM, ),
9 [] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan, Geometric approximation via coresets, in: Combinatorial and Computational Geometry J. Goodman, J. Pach, and E.Welzl, eds.), Cambridge University Press, 005, pp [] P. K. Agarwal, J. Matoušek, and S. Suri, Farthest neighbors, maximum spanning trees and related problems in higher dimensions, Comput. Geom. Theory Appl., 1 199), [4] P. K. Agarwal and H. Yu, A space-optimal data-stream algorithm for coresets in the plane, Proc. rd Annual Sympos. Comput. Geom., 007, pp [5] C. Aggarwal, Data Streams: Models and Algorithms, Springer, 007. [6] N. Alon, Y. Matias, and M. Szegedy, The space complexity of approximating the frequency moments, J. Comput. Syst. Sci., ), [7] A. Andoni, D. Croitoru, and M. Patrascu, Hardness of nearest neighbor under L-infinity, Proc. 49th Annual IEEE Sympos. Foundations of Comp. Sc., 008, pp [8] K. Ball, An elementary introduction to modern convex geometry, in Flavors of Geometry, ), [9] M. Bădoiu and K. L. Clarkson, Optimal core-sets for balls, Comput. Geom. Theory Appl., ), 14. [10] M. Bădoiu, S. Har-Peled, and P. Indyk, Approximate clustering via core-sets, Proc. 4th Annual ACM Sympos. Theory Comput., 00, pp [11] T. M. Chan, Approximating the diameter, width, smallest enclosing cylinder, and minimum-width annulus, Intl. J. Comput. Geom. Appl., 1 00), [1] T. M. Chan, Faster core-set constructions and datastream algorithms in fixed dimensions, Comput. Geom. Theory Appl., 5 006), 0 5. [1] B. Chazelle, H. Edelsbrunner, M. Grigni, L. Guibas, M.Sharir, and E. Welzl, Improved bounds on weak ε-nets for convex sets, Discrete Comput. Geom., ), [14] B. Chazelle and J. Matoušek, On linear-time deterministic algorithms for optimization problems in fixed dimension, J. Algorithms, ), [15] K. L. Clarkson, Las Vegas algorithms for linear and integer programming when the dimension is small, J. ACM, ), [16] K. L. Clarkson, Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm, Proc. 19th Annual ACM-SIAM Sympos. Discrete Algorithms, 008, pp [17] K. L. Clarkson and P. W. Shor, Applications of random sampling in computational geometry, II., Discrete Comput. Geom., 1987), 195. [18] K. L. Clarkson and D. P. Woodruff, Numerical linear algebra in the streaming model, Proc. 41st Annual ACM Sympos. Theory of Comput., 009, pp [19] B. Gärtner and M. Jaggi, Coresets for polytope distance, Proc. 5th Annual Sympos. Comput. Geom., 009, pp. 4. [0] A. Goel, P. Indyk, and K. Varadarajan, Reductions among high dimensional proximity problems, Proc. 1th Annual ACM-SIAM Sympos. Discrete Algorithms, 001, pp [1] S. Har-Peled and K. R. Varadarajan, Projective clustering in high dimensions using core-sets, In Proc. 18th Annual Sympos. Comput. Geom, 00, pp [] P. Indyk, Better algorithms for high-dimensional proximity problems via asymmetric embeddings, Proc. 14th Annual ACM-SIAM Sympos. Discrete Algorithms, 00, pp [] P. Kumar, J. S. B. Mitchell, and E. A. Yildirim, Approximate minimum enclosing balls in high dimensions using core-sets, J. Exp. Algorithmics, 8 00), 1.1. [4] P. Kumar and E. A. Yildirim, Minimum-Volume Enclosing Ellipsoids and Core Sets, J. Optimization Theory and Appl., ), 1 1. [5] E. Kushilevitz and N. Nisan, Communication Complexity, Cambridge University Press, [6] J. Matoušek, M. Sharir, and E. Welzl, A subexponential bound for linear programming, Algorithmica, ), [7] S. Muthukrishnan, Data Streams: Algorithms and Applications, Now Publishers Inc., 005. [8] E. A. Ramos, An Optimal Deterministic Algorithm for Computing the Diameter of a three-dimensional point set, Discrete Comput. Geom., 6 001), 44. [9] H. Zarrabi-Zadeh and T. Chan, A simple streaming algorithm for minimum enclosing balls, Proc. 18th Annual Canadian Conf. Comput. Geom., 006, pp
Geometric Optimization Problems over Sliding Windows
Geometric Optimization Problems over Sliding Windows Timothy M. Chan and Bashir S. Sadjad School of Computer Science University of Waterloo Waterloo, Ontario, N2L 3G1, Canada {tmchan,bssadjad}@uwaterloo.ca
More informationA Streaming Algorithm for 2-Center with Outliers in High Dimensions
CCCG 2015, Kingston, Ontario, August 10 12, 2015 A Streaming Algorithm for 2-Center with Outliers in High Dimensions Behnam Hatami Hamid Zarrabi-Zadeh Abstract We study the 2-center problem with outliers
More informationAn efficient approximation for point-set diameter in higher dimensions
CCCG 2018, Winnipeg, Canada, August 8 10, 2018 An efficient approximation for point-set diameter in higher dimensions Mahdi Imanparast Seyed Naser Hashemi Ali Mohades Abstract In this paper, we study the
More informationarxiv:cs/ v1 [cs.cg] 7 Feb 2006
Approximate Weighted Farthest Neighbors and Minimum Dilation Stars John Augustine, David Eppstein, and Kevin A. Wortman Computer Science Department University of California, Irvine Irvine, CA 92697, USA
More informationGeometric Approximation via Coresets
Discrete and Computational Geometry MSRI Publications Volume 52, 2005 Geometric Approximation via Coresets PANKAJ K. AGARWAL, SARIEL HAR-PELED, AND KASTURI R. VARADARAJAN Abstract. The paradigm of coresets
More informationA Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees
A Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees Sunil Arya Hong Kong University of Science and Technology and David Mount University of Maryland Arya and Mount HALG
More informationA Space-Optimal Data-Stream Algorithm for Coresets in the Plane
A Space-Optimal Data-Stream Algorithm for Coresets in the Plane Pankaj K. Agarwal Hai Yu Abstract Given a point set P R 2, a subset Q P is an ε-kernel of P if for every slab W containing Q, the (1 + ε)-expansion
More informationSuccinct Data Structures for Approximating Convex Functions with Applications
Succinct Data Structures for Approximating Convex Functions with Applications Prosenjit Bose, 1 Luc Devroye and Pat Morin 1 1 School of Computer Science, Carleton University, Ottawa, Canada, K1S 5B6, {jit,morin}@cs.carleton.ca
More informationGeometric Approximation via Coresets
Geometric Approximation via Coresets Pankaj K. Agarwal Sariel Har-Peled Kasturi R. Varadarajan October 22, 2004 Abstract The paradigm of coresets has recently emerged as a powerful tool for efficiently
More informationThe τ-skyline for Uncertain Data
CCCG 2014, Halifax, Nova Scotia, August 11 13, 2014 The τ-skyline for Uncertain Data Haitao Wang Wuzhou Zhang Abstract In this paper, we introduce the notion of τ-skyline as an alternative representation
More informationOn Approximating the Depth and Related Problems
On Approximating the Depth and Related Problems Boris Aronov Polytechnic University, Brooklyn, NY Sariel Har-Peled UIUC, Urbana, IL 1: Motivation: Operation Inflicting Freedom Input: R - set red points
More informationApproximating the Minimum Closest Pair Distance and Nearest Neighbor Distances of Linearly Moving Points
Approximating the Minimum Closest Pair Distance and Nearest Neighbor Distances of Linearly Moving Points Timothy M. Chan Zahed Rahmati Abstract Given a set of n moving points in R d, where each point moves
More informationProjective Clustering in High Dimensions using Core-Sets
Projective Clustering in High Dimensions using Core-Sets Sariel Har-Peled Kasturi R. Varadarajan January 31, 2003 Abstract In this paper, we show that there exists a small core-set for the problem of computing
More informationPartitions and Covers
University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Dong Wang Date: January 2, 2012 LECTURE 4 Partitions and Covers In previous lectures, we saw
More informationFly Cheaply: On the Minimum Fuel Consumption Problem
Journal of Algorithms 41, 330 337 (2001) doi:10.1006/jagm.2001.1189, available online at http://www.idealibrary.com on Fly Cheaply: On the Minimum Fuel Consumption Problem Timothy M. Chan Department of
More informationThe 2-Center Problem in Three Dimensions
The 2-Center Problem in Three Dimensions Pankaj K. Agarwal Rinat Ben Avraham Micha Sharir Abstract Let P be a set of n points in R 3. The 2-center problem for P is to find two congruent balls of minimum
More informationA Simple Proof of Optimal Epsilon Nets
A Simple Proof of Optimal Epsilon Nets Nabil H. Mustafa Kunal Dutta Arijit Ghosh Abstract Showing the existence of ɛ-nets of small size has been the subject of investigation for almost 30 years, starting
More informationClustering in High Dimensions. Mihai Bădoiu
Clustering in High Dimensions by Mihai Bădoiu Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Bachelor of Science
More informationApproximate Range Searching: The Absolute Model
Approximate Range Searching: The Absolute Model Guilherme D. da Fonseca Department of Computer Science University of Maryland College Park, Maryland 20742 fonseca@cs.umd.edu David M. Mount Department of
More informationTransforming Hierarchical Trees on Metric Spaces
CCCG 016, Vancouver, British Columbia, August 3 5, 016 Transforming Hierarchical Trees on Metric Spaces Mahmoodreza Jahanseir Donald R. Sheehy Abstract We show how a simple hierarchical tree called a cover
More informationarxiv: v1 [cs.ds] 8 Aug 2014
Asymptotically exact streaming algorithms Marc Heinrich, Alexander Munteanu, and Christian Sohler Département d Informatique, École Normale Supérieure, Paris, France marc.heinrich@ens.fr Department of
More informationCS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms
CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms Antoine Vigneron King Abdullah University of Science and Technology December 5, 2012 Antoine Vigneron (KAUST) CS 372 Lecture
More informationarxiv: v2 [cs.cg] 28 Sep 2012
No Dimension Independent Core-Sets for Containment under Homothetics arxiv:1010.49v [cs.cg] 8 Sep 01 René Brandenberg Zentrum Mathematik Technische Universität München Boltzmannstr. 3 85747 Garching bei
More informationApproximate Clustering via Core-Sets
Approximate Clustering via Core-Sets Mihai Bădoiu Sariel Har-Peled Piotr Indyk 22/02/2002 17:40 Abstract In this paper, we show that for several clustering problems one can extract a small set of points,
More informationAlgorithms for Streaming Data
Algorithms for Piotr Indyk Problems defined over elements P={p 1,,p n } The algorithm sees p 1, then p 2, then p 3, Key fact: it has limited storage Can store only s
More informationApplications of Chebyshev Polynomials to Low-Dimensional Computational Geometry
Applications of Chebyshev Polynomials to Low-Dimensional Computational Geometry Timothy M. Chan 1 1 Department of Computer Science, University of Illinois at Urbana-Champaign tmc@illinois.edu Abstract
More informationA non-linear lower bound for planar epsilon-nets
A non-linear lower bound for planar epsilon-nets Noga Alon Abstract We show that the minimum possible size of an ɛ-net for point objects and line (or rectangle)- ranges in the plane is (slightly) bigger
More informationStability of ε-kernels
Stability of ε-kernels Pankaj K. Agarwal Jeff M. Phillips Hai Yu June 8, 2010 Abstract Given a set P of n points in R d, an ε-kernel K P approximates the directional width of P in every direction within
More informationLower Bounds on the Complexity of Simplex Range Reporting on a Pointer Machine
Lower Bounds on the Complexity of Simplex Range Reporting on a Pointer Machine Extended Abstract Bernard Chazelle Department of Computer Science Princeton University Burton Rosenberg Department of Mathematics
More informationOn Lines and Joints. August 17, Abstract
On Lines and Joints Haim Kaplan Micha Sharir Eugenii Shustin August 17, 2009 Abstract Let L be a set of n lines in R d, for d 3. A joint of L is a point incident to at least d lines of L, not all in a
More informationOn the Optimality of the Dimensionality Reduction Method
On the Optimality of the Dimensionality Reduction Method Alexandr Andoni MIT andoni@mit.edu Piotr Indyk MIT indyk@mit.edu Mihai Pǎtraşcu MIT mip@mit.edu Abstract We investigate the optimality of (1+)-approximation
More informationOptimal Data-Dependent Hashing for Approximate Near Neighbors
Optimal Data-Dependent Hashing for Approximate Near Neighbors Alexandr Andoni 1 Ilya Razenshteyn 2 1 Simons Institute 2 MIT, CSAIL April 20, 2015 1 / 30 Nearest Neighbor Search (NNS) Let P be an n-point
More informationThe Effect of Corners on the Complexity of Approximate Range Searching
Discrete Comput Geom (2009) 41: 398 443 DOI 10.1007/s00454-009-9140-z The Effect of Corners on the Complexity of Approximate Range Searching Sunil Arya Theocharis Malamatos David M. Mount Received: 31
More informationSemi-algebraic Range Reporting and Emptiness Searching with Applications
Semi-algebraic Range Reporting and Emptiness Searching with Applications Micha Sharir Hayim Shaul July 14, 2009 Abstract In a typical range emptiness searching (resp., reporting) problem, we are given
More informationCMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms
CMSC 858F: Algorithmic Lower Bounds: Fun with Hardness Proofs Fall 2014 Introduction to Streaming Algorithms Instructor: Mohammad T. Hajiaghayi Scribe: Huijing Gong November 11, 2014 1 Overview In the
More informationGeometric Problems in Moderate Dimensions
Geometric Problems in Moderate Dimensions Timothy Chan UIUC Basic Problems in Comp. Geometry Orthogonal range search preprocess n points in R d s.t. we can detect, or count, or report points inside a query
More informationOn the Most Likely Voronoi Diagram and Nearest Neighbor Searching?
On the Most Likely Voronoi Diagram and Nearest Neighbor Searching? Subhash Suri and Kevin Verbeek Department of Computer Science, University of California, Santa Barbara, USA. Abstract. We consider the
More informationarxiv: v1 [cs.ds] 18 Sep 2015
Streaming Verification in Data Analysis Samira Daruki Justin Thaler Suresh Venkatasubramanian arxiv:1509.05514v1 [cs.ds] 18 Sep 2015 Abstract Streaming interactive proofs (SIPs) are a framework to reason
More informationSubsampling in Smoothed Range Spaces
Subsampling in Smoothed Range Spaces Jeff M. Phillips and Yan Zheng University of Utah Abstract. We consider smoothed versions of geometric range spaces, so an element of the ground set (e.g. a point)
More informationLecture 10. Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1
Lecture 10 Sublinear Time Algorithms (contd) CSC2420 Allan Borodin & Nisarg Shah 1 Recap Sublinear time algorithms Deterministic + exact: binary search Deterministic + inexact: estimating diameter in a
More informationHigher Cell Probe Lower Bounds for Evaluating Polynomials
Higher Cell Probe Lower Bounds for Evaluating Polynomials Kasper Green Larsen MADALGO, Department of Computer Science Aarhus University Aarhus, Denmark Email: larsen@cs.au.dk Abstract In this paper, we
More informationSupremum of simple stochastic processes
Subspace embeddings Daniel Hsu COMS 4772 1 Supremum of simple stochastic processes 2 Recap: JL lemma JL lemma. For any ε (0, 1/2), point set S R d of cardinality 16 ln n S = n, and k N such that k, there
More informationSimilarity searching, or how to find your neighbors efficiently
Similarity searching, or how to find your neighbors efficiently Robert Krauthgamer Weizmann Institute of Science CS Research Day for Prospective Students May 1, 2009 Background Geometric spaces and techniques
More informationA Width of Points in the Streaming Model
A Width of Points in the Streaming Model ALEXANDR ANDONI, Microsoft Research HUY L. NGUY ÊN, Princeton University We show how to compute the width of a dynamic set of low-dimensional points in the streaming
More informationApproximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry
Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry Frank Nielsen 1 and Gaëtan Hadjeres 1 École Polytechnique, France/Sony Computer Science Laboratories, Japan Frank.Nielsen@acm.org
More informationChoosing Subsets with Maximum Weighted Average
Choosing Subsets with Maximum Weighted Average David Eppstein Daniel S. Hirschberg Department of Information and Computer Science University of California, Irvine, CA 92717 October 4, 1996 Abstract Given
More informationSpace Exploration via Proximity Search
Space Exploration via Proximity Search Sariel Har-Peled Nirman Kumar David M. Mount Benjamin Raichel July 7, 2018 arxiv:1412.1398v1 [cs.cg] 3 Dec 2014 Abstract We investigate what computational tasks can
More informationLecture Lecture 9 October 1, 2015
CS 229r: Algorithms for Big Data Fall 2015 Lecture Lecture 9 October 1, 2015 Prof. Jelani Nelson Scribe: Rachit Singh 1 Overview In the last lecture we covered the distance to monotonicity (DTM) and longest
More informationarxiv: v1 [math.co] 25 Apr 2016
On the zone complexity of a ertex Shira Zerbib arxi:604.07268 [math.co] 25 Apr 206 April 26, 206 Abstract Let L be a set of n lines in the real projectie plane in general position. We show that there exists
More informationLecture 10 + additional notes
CSE533: Information Theorn Computer Science November 1, 2010 Lecturer: Anup Rao Lecture 10 + additional notes Scribe: Mohammad Moharrami 1 Constraint satisfaction problems We start by defining bivariate
More informationCovering an ellipsoid with equal balls
Journal of Combinatorial Theory, Series A 113 (2006) 1667 1676 www.elsevier.com/locate/jcta Covering an ellipsoid with equal balls Ilya Dumer College of Engineering, University of California, Riverside,
More informationApproximate Nearest Neighbor (ANN) Search in High Dimensions
Chapter 17 Approximate Nearest Neighbor (ANN) Search in High Dimensions By Sariel Har-Peled, February 4, 2010 1 17.1 ANN on the Hypercube 17.1.1 Hypercube and Hamming distance Definition 17.1.1 The set
More informationAn Algorithmist s Toolkit Nov. 10, Lecture 17
8.409 An Algorithmist s Toolkit Nov. 0, 009 Lecturer: Jonathan Kelner Lecture 7 Johnson-Lindenstrauss Theorem. Recap We first recap a theorem (isoperimetric inequality) and a lemma (concentration) from
More informationSubgradient and Sampling Algorithms for l 1 Regression
Subgradient and Sampling Algorithms for l 1 Regression Kenneth L. Clarkson January 4, 2005 Abstract Given an n d matrix A and an n-vector b, the l 1 regression problem is to find the vector x minimizing
More informationSubgradient and Sampling Algorithms for l 1 Regression
Subgradient and Sampling Algorithms for l 1 Regression Kenneth L. Clarkson Abstract Given an n d matrix A and an n-vector b, the l 1 regression problem is to find the vector x minimizing the objective
More informationCombinatorial Generalizations of Jung s Theorem
Discrete Comput Geom (013) 49:478 484 DOI 10.1007/s00454-013-9491-3 Combinatorial Generalizations of Jung s Theorem Arseniy V. Akopyan Received: 15 October 011 / Revised: 11 February 013 / Accepted: 11
More informationA Mergeable Summaries
A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires
More informationCommunication Complexity
Communication Complexity Jie Ren Adaptive Signal Processing and Information Theory Group Nov 3 rd, 2014 Jie Ren (Drexel ASPITRG) CC Nov 3 rd, 2014 1 / 77 1 E. Kushilevitz and N. Nisan, Communication Complexity,
More informationApproximating the Radii of Point Sets
Approximating the Radii of Point Sets Kasturi Varadarajan S. Venkatesh Yinyu Ye Jiawei Zhang April 17, 2006 Abstract We consider the problem of computing the outer-radii of point sets. In this problem,
More informationSketching as a Tool for Numerical Linear Algebra
Sketching as a Tool for Numerical Linear Algebra David P. Woodruff presented by Sepehr Assadi o(n) Big Data Reading Group University of Pennsylvania February, 2015 Sepehr Assadi (Penn) Sketching for Numerical
More informationLearning convex bodies is hard
Learning convex bodies is hard Navin Goyal Microsoft Research India navingo@microsoft.com Luis Rademacher Georgia Tech lrademac@cc.gatech.edu Abstract We show that learning a convex body in R d, given
More informationClarkson s Algorithm for Violator Spaces. Yves Brise, ETH Zürich, CCCG Joint work with Bernd Gärtner
Clarkson s Algorithm for Violator Spaces Yves Brise, ETH Zürich, CCCG 20090817 Joint work with Bernd Gärtner A Hierarchy of Optimization Frameworks Linear Programming Halfspaces (constraints), Optimization
More informationMulti-Dimensional Online Tracking
Multi-Dimensional Online Tracking Ke Yi and Qin Zhang Hong Kong University of Science & Technology SODA 2009 January 4-6, 2009 1-1 A natural problem Bob: tracker f(t) g(t) Alice: observer (t, g(t)) t 2-1
More information7. Lecture notes on the ellipsoid algorithm
Massachusetts Institute of Technology Michel X. Goemans 18.433: Combinatorial Optimization 7. Lecture notes on the ellipsoid algorithm The simplex algorithm was the first algorithm proposed for linear
More informationOn Regular Vertices of the Union of Planar Convex Objects
On Regular Vertices of the Union of Planar Convex Objects Esther Ezra János Pach Micha Sharir Abstract Let C be a collection of n compact convex sets in the plane, such that the boundaries of any pair
More informationThe Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal y Otfried Schwarzkopf z Micha Sharir x September 1, 1994 Abstract Let F and G be
CS-1994-18 The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal Otfried Schwarzkopf Micha Sharir Department of Computer Science Duke University Durham, North Carolina 27708{0129 September
More informationA Separator Theorem for Graphs with an Excluded Minor and its Applications
A Separator Theorem for Graphs with an Excluded Minor and its Applications Noga Alon IBM Almaden Research Center, San Jose, CA 95120,USA and Sackler Faculty of Exact Sciences, Tel Aviv University, Tel
More informationGeometry. On Galleries with No Bad Points. P. Valtr. 1. Introduction
Discrete Comput Geom 21:13 200 (1) Discrete & Computational Geometry 1 Springer-Verlag New York Inc. On Galleries with No Bad Points P. Valtr Department of Applied Mathematics, Charles University, Malostranské
More informationOptimal compression of approximate Euclidean distances
Optimal compression of approximate Euclidean distances Noga Alon 1 Bo az Klartag 2 Abstract Let X be a set of n points of norm at most 1 in the Euclidean space R k, and suppose ε > 0. An ε-distance sketch
More informationCell-Probe Proofs and Nondeterministic Cell-Probe Complexity
Cell-obe oofs and Nondeterministic Cell-obe Complexity Yitong Yin Department of Computer Science, Yale University yitong.yin@yale.edu. Abstract. We study the nondeterministic cell-probe complexity of static
More informationSzemerédi-Trotter theorem and applications
Szemerédi-Trotter theorem and applications M. Rudnev December 6, 2004 The theorem Abstract These notes cover the material of two Applied post-graduate lectures in Bristol, 2004. Szemerédi-Trotter theorem
More informationComputational Geometry: Theory and Applications
Computational Geometry 43 (2010) 434 444 Contents lists available at ScienceDirect Computational Geometry: Theory and Applications www.elsevier.com/locate/comgeo Approximate range searching: The absolute
More informationSome Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform
Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform Nir Ailon May 22, 2007 This writeup includes very basic background material for the talk on the Fast Johnson Lindenstrauss Transform
More informationLecture notes on the ellipsoid algorithm
Massachusetts Institute of Technology Handout 1 18.433: Combinatorial Optimization May 14th, 007 Michel X. Goemans Lecture notes on the ellipsoid algorithm The simplex algorithm was the first algorithm
More informationSketching in Adversarial Environments
Sketching in Adversarial Environments Ilya Mironov Moni Naor Gil Segev Abstract We formalize a realistic model for computations over massive data sets. The model, referred to as the adversarial sketch
More informationEconomical Delone Sets for Approximating Convex Bodies
Economical Delone Sets for Approimating Conve Bodies Ahmed Abdelkader Department of Computer Science University of Maryland College Park, MD 20742, USA akader@cs.umd.edu https://orcid.org/0000-0002-6749-1807
More informationTHE UNIT DISTANCE PROBLEM ON SPHERES
THE UNIT DISTANCE PROBLEM ON SPHERES KONRAD J. SWANEPOEL AND PAVEL VALTR Abstract. For any D > 1 and for any n 2 we construct a set of n points on a sphere in R 3 of diameter D determining at least cn
More informationNondeterminism LECTURE Nondeterminism as a proof system. University of California, Los Angeles CS 289A Communication Complexity
University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Matt Brown Date: January 25, 2012 LECTURE 5 Nondeterminism In this lecture, we introduce nondeterministic
More informationA Note on Smaller Fractional Helly Numbers
A Note on Smaller Fractional Helly Numbers Rom Pinchasi October 4, 204 Abstract Let F be a family of geometric objects in R d such that the complexity (number of faces of all dimensions on the boundary
More informationCoresets for k-means and k-median Clustering and their Applications
Coresets for k-means and k-median Clustering and their Applications Sariel Har-Peled Soham Mazumdar November 7, 2003 Abstract In this paper, we show the existence of small coresets for the problems of
More informationCell-Probe Lower Bounds for the Partial Match Problem
Cell-Probe Lower Bounds for the Partial Match Problem T.S. Jayram Subhash Khot Ravi Kumar Yuval Rabani Abstract Given a database of n points in {0, 1} d, the partial match problem is: In response to a
More informationChapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end wit
Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end with some cryptic AMS subject classications and a few
More informationA Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window
A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window Costas Busch 1 and Srikanta Tirthapura 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY
More informationReporting Neighbors in High-Dimensional Euclidean Space
Reporting Neighbors in High-Dimensional Euclidean Space Dror Aiger Haim Kaplan Micha Sharir Abstract We consider the following problem, which arises in many database and web-based applications: Given a
More informationA CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS
A CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS Santanu Bhowmick, Kasturi Varadarajan, and Shi-Ke Xue Abstract. We consider the following multi-covering problem with disks. We are given two
More informationA lower bound for scheduling of unit jobs with immediate decision on parallel machines
A lower bound for scheduling of unit jobs with immediate decision on parallel machines Tomáš Ebenlendr Jiří Sgall Abstract Consider scheduling of unit jobs with release times and deadlines on m identical
More informationCell-Probe Lower Bounds for Prefix Sums and Matching Brackets
Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Emanuele Viola July 6, 2009 Abstract We prove that to store strings x {0, 1} n so that each prefix sum a.k.a. rank query Sumi := k i x k can
More informationA Las Vegas approximation algorithm for metric 1-median selection
A Las Vegas approximation algorithm for metric -median selection arxiv:70.0306v [cs.ds] 5 Feb 07 Ching-Lueh Chang February 8, 07 Abstract Given an n-point metric space, consider the problem of finding
More informationDiameter and k-center in Sliding Windows
Diameter and k-center in Sliding Windows Vincent Cohen-Addad 1, Chris Schwiegelshohn 2, and Christian Sohler 2 1 Département d informatique, École normale supérieure, Paris, France. vincent.cohen@ens.fr
More informationThe Overlay of Lower Envelopes and its Applications. March 16, Abstract
The Overlay of Lower Envelopes and its Applications Pankaj K. Agarwal y Otfried Schwarzkopf z Micha Sharir x March 16, 1996 Abstract Let F and G be two collections of a total of n (possibly partially-dened)
More informationThe concentration of the chromatic number of random graphs
The concentration of the chromatic number of random graphs Noga Alon Michael Krivelevich Abstract We prove that for every constant δ > 0 the chromatic number of the random graph G(n, p) with p = n 1/2
More informationLecture 16 Oct 21, 2014
CS 395T: Sublinear Algorithms Fall 24 Prof. Eric Price Lecture 6 Oct 2, 24 Scribe: Chi-Kit Lam Overview In this lecture we will talk about information and compression, which the Huffman coding can achieve
More informationOn (ε, k)-min-wise independent permutations
On ε, -min-wise independent permutations Noga Alon nogaa@post.tau.ac.il Toshiya Itoh titoh@dac.gsic.titech.ac.jp Tatsuya Nagatani Nagatani.Tatsuya@aj.MitsubishiElectric.co.jp Abstract A family of permutations
More informationNonlinear Discrete Optimization
Nonlinear Discrete Optimization Technion Israel Institute of Technology http://ie.technion.ac.il/~onn Billerafest 2008 - conference in honor of Lou Billera's 65th birthday (Update on Lecture Series given
More informationEuclidean Shortest Path Algorithm for Spherical Obstacles
Euclidean Shortest Path Algorithm for Spherical Obstacles Fajie Li 1, Gisela Klette 2, and Reinhard Klette 3 1 College of Computer Science and Technology, Huaqiao University Xiamen, Fujian, China 2 School
More informationLinear-Time Approximation Algorithms for Unit Disk Graphs
Linear-Time Approximation Algorithms for Unit Disk Graphs Guilherme D. da Fonseca Vinícius G. Pereira de Sá elina M. H. de Figueiredo Universidade Federal do Rio de Janeiro, Brazil Abstract Numerous approximation
More informationarxiv: v1 [cs.cg] 29 Jun 2012
Single-Source Dilation-Bounded Minimum Spanning Trees Otfried Cheong Changryeol Lee May 2, 2014 arxiv:1206.6943v1 [cs.cg] 29 Jun 2012 Abstract Given a set S of points in the plane, a geometric network
More informationRandomized Time Space Tradeoffs for Directed Graph Connectivity
Randomized Time Space Tradeoffs for Directed Graph Connectivity Parikshit Gopalan, Richard Lipton, and Aranyak Mehta College of Computing, Georgia Institute of Technology, Atlanta GA 30309, USA. Email:
More informationUnion of Random Minkowski Sums and Network Vulnerability Analysis
Union of Random Minkowski Sums and Network Vulnerability Analysis Pankaj K. Agarwal Sariel Har-Peled Haim Kaplan Micha Sharir June 29, 2014 Abstract Let C = {C 1,..., C n } be a set of n pairwise-disjoint
More informationOn the communication and streaming complexity of maximum bipartite matching
On the communication and streaming complexity of maximum bipartite matching Ashish Goel Michael Kapralov Sanjeev Khanna November 27, 2 Abstract Consider the following communication problem. Alice holds
More information