Approximate Voronoi Diagrams

Similar documents
CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms

arxiv: v1 [cs.cg] 5 Sep 2018

WELL-SEPARATED PAIR DECOMPOSITION FOR THE UNIT-DISK GRAPH METRIC AND ITS APPLICATIONS

Transforming Hierarchical Trees on Metric Spaces

Partitioning Metric Spaces

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Making Nearest Neighbors Easier. Restrictions on Input Algorithms for Nearest Neighbor Search: Lecture 4. Outline. Chapter XI

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data

APPROXIMATION ALGORITHMS FOR PROXIMITY AND CLUSTERING PROBLEMS YOGISH SABHARWAL

An Algorithmist s Toolkit Nov. 10, Lecture 17

Nearest Neighbor Searching Under Uncertainty. Wuzhou Zhang Supervised by Pankaj K. Agarwal Department of Computer Science Duke University

Approximate Nearest Neighbor (ANN) Search in High Dimensions

Lecture 2 September 4, 2014

A Fast and Simple Algorithm for Computing Approximate Euclidean Minimum Spanning Trees

arxiv: v2 [cs.cg] 26 Jun 2017

Space-Time Tradeoffs for Approximate Nearest Neighbor Searching

arxiv:cs/ v3 [cs.ds] 22 Aug 2005

Even More on Dynamic Programming

More Dynamic Programming

Cuts & Metrics: Multiway Cut

Proximity problems in high dimensions

Assignment 5: Solutions

Similarity Search in High Dimensions II. Piotr Indyk MIT

Ramsey partitions and proximity data structures

Nearest-Neighbor Searching Under Uncertainty

The Fast Multipole Method and other Fast Summation Techniques

(1 + ε)-approximation for Facility Location in Data Streams

Algorithms for Calculating Statistical Properties on Moving Points

Tailored Bregman Ball Trees for Effective Nearest Neighbors

Scribes: Po-Hsuan Wei, William Kuzmaul Editor: Kevin Wu Date: October 18, 2016

CSC 2414 Lattices in Computer Science September 27, Lecture 4. An Efficient Algorithm for Integer Programming in constant dimensions

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Lecture 2: The Fast Multipole Method

Ramsey partitions and proximity data structures

Approximate k-at Nearest Neighbor Search

15-854: Approximation Algorithms Lecturer: Anupam Gupta Topic: Approximating Metrics by Tree Metrics Date: 10/19/2005 Scribe: Roy Liu

Locality Sensitive Hashing

Finite Metric Spaces & Their Embeddings: Introduction and Basic Tools

Lecture 2: A Las Vegas Algorithm for finding the closest pair of points in the plane

Region-Fault Tolerant Geometric Spanners

Fly Cheaply: On the Minimum Fuel Consumption Problem

Partitions and Covers

Nearest Neighbor Search with Keywords

Random projection trees and low dimensional manifolds. Sanjoy Dasgupta and Yoav Freund University of California, San Diego

arxiv: v1 [cs.cg] 3 Aug 2011

On the Competitive Ratio for Online Facility Location

R. Lachieze-Rey Recent Berry-Esseen bounds obtained with Stein s method andgeorgia PoincareTech. inequalities, 1 / with 29 G.

Lecture 14 October 16, 2014

CS4311 Design and Analysis of Algorithms. Analysis of Union-By-Rank with Path Compression

CS 273 Prof. Serafim Batzoglou Prof. Jean-Claude Latombe Spring Lecture 12 : Energy maintenance (1) Lecturer: Prof. J.C.

Chapter 11. Min Cut Min Cut Problem Definition Some Definitions. By Sariel Har-Peled, December 10, Version: 1.

In-Class Soln 1. CS 361, Lecture 4. Today s Outline. In-Class Soln 2

Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 11: From random walk to quantum walk

A CONSTANT-FACTOR APPROXIMATION FOR MULTI-COVERING WITH DISKS

A Polynomial-Time Approximation Scheme for Euclidean Steiner Forest

arxiv:cs/ v1 [cs.cg] 7 Feb 2006

Space-Time Tradeoffs for Approximate Spherical Range Counting

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

High-Dimensional Indexing by Distributed Aggregation

A Simple Linear Time (1 + ε)-approximation Algorithm for k-means Clustering in Any Dimensions

k th -order Voronoi Diagrams

Entropy-Preserving Cuttings and Space-Efficient Planar Point Location

Asymptotic Modularity of some Graph Classes

Online Learning Summer School Copenhagen 2015 Lecture 1

An efficient approximation for point-set diameter in higher dimensions

Machine Learning Lecture 14

Midterm 2 for CS 170

A PTAS for TSP with Fat Weakly Disjoint Neighborhoods in Doubling Metrics

Connectivity of Wireless Sensor Networks with Constant Density

Lecture 18: March 15

Approximation Algorithms for Min-Sum k-clustering and Balanced k-median

Distributed Distance-Bounded Network Design Through Distributed Convex Programming

A Polynomial Time Deterministic Algorithm for Identity Testing Read-Once Polynomials

Randomization in Algorithms and Data Structures

Compressing Kinetic Data From Sensor Networks. Sorelle A. Friedler (Swat 04) Joint work with David Mount University of Maryland, College Park

Succinct Data Structures for Approximating Convex Functions with Applications

Lecture 7: Passive Learning

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Mid Term-1 : Practice problems

The Tree Evaluation Problem: Towards Separating P from NL. Lecture #6: 26 February 2014

ECE353: Probability and Random Processes. Lecture 3 - Independence and Sequential Experiments

Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018

Nearest-Neighbor Searching Under Uncertainty

CS151 Complexity Theory. Lecture 13 May 15, 2017

Design and Analysis of Algorithms

Jeffrey D. Ullman Stanford University

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

Introduction to Discrete Optimization

13 : Variational Inference: Loopy Belief Propagation and Mean Field

Algorithms for Picture Analysis. Lecture 07: Metrics. Axioms of a Metric

The black-box complexity of nearest neighbor search

Disjoint-Set Forests

47-831: Advanced Integer Programming Lecturer: Amitabh Basu Lecture 2 Date: 03/18/2010

Compute the Fourier transform on the first register to get x {0,1} n x 0.

Metrics: Growth, dimension, expansion

Active Learning: Disagreement Coefficient

Machine Learning for Data Science (CS4786) Lecture 11

Lecture 6 September 21, 2016

Transcription:

CS468, Mon. Oct. 30 th, 2006 Approximate Voronoi Diagrams Presentation by Maks Ovsjanikov S. Har-Peled s notes, Chapters 6 and 7 1-1

Outline Preliminaries Problem Statement ANN using PLEB } Bounds and Improvements Near Linear Space Linear Space (Previous Lecture) ANN in R d using compressed quad-trees 2-1

Preliminaries v u 3-1

Preliminaries d(u, v) v u 3-2

Preliminaries d(u, v) v u d(q, v) d(q, u) q 3-3

Preliminaries d(u, v) v u d(q, v) d(u,v) d(q, u) d(u,v) q 3-4

Preliminaries d(u, v) v u d(q, v) d(u,v) d(q, u) d(u,v) q d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + 3-5

Preliminaries d(q, u) = αd(u, v) d(q, v) = (1 + α)d(u, v) = (1 + 1 )d(u, v) α u d(u, v) v d(q, v) d(u,v) q d(q, u) d(u,v) q d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + 3-6

Preliminaries d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + Holds in any metric space: 4-1

Preliminaries d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + Holds in any metric space: d(q, u) = αd(u, v) d(q, v) d(q, u) + d(u, v) = (1 + 1 α )d(q, u) = d(q,v) d(q,u) (1 + 1 α ) (1 + ) if α 1 4-2

Preliminaries d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + Holds in any metric space: d(q, u) = αd(u, v) d(q, v) d(q, u) + d(u, v) = (1 + 1 α )d(q, u) = d(q,v) d(q,u) (1 + 1 α ) (1 + ) if α 1 Similarly: d(q, v) = αd(u, v) = d(q,u) d(q,v) (1 + 1 α ) (1 + ) if α 1 4-3

Preliminaries Moral: d(q, u) d(u,v) d(q, v) d(u,v) = d(q, v) d(q, u) 1 + Any of the far away points is a (1 + ) closest neighbor 4-4

Problem Statement: For a given, find a (1 + ) Aproximate Voronoi Diagram: Partition of space into regions with one representative r i per region, such that for any point q in region i, r i is a (1 + ) nearest neighbor of q 5-1

Problem Statement: For a given, find a (1 + ) Aproximate Voronoi Diagram: Partition of space into regions with one representative r i per region, such that for any point q in region i, r i is a (1 + ) nearest neighbor of q 5-2

Problem Statement: For a given, find a (1 + ) Aproximate Voronoi Diagram: Partition of space into regions with one representative r i per region, such that for any point q in region i, r i is a (1 + ) nearest neighbor of q Constraints: bounded construction time and space (complexity) Cover all space sub-linear (1+) NN queries 5-3

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 6-1

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-2

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-3

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-4

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-5

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-6

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-7

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-8

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries 1) Construct balls of radius (1 + ) i around each point, for i = 1.. 6-9

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries (1 + ) 5 q > (1 + ) 4 For any query point q, return the center p of the smallest ball that contains it: d(q, n) > (1 + ) i 1, and d(q, p) (1 + ) i < (1 + ) d(q, n) = always get a (1 + )-Nearest Neighbor 6-10

ANN using PLEB Reduce (1 + )-ANN queries to target ball queries Problems: Unbounded Number of Balls Not clear how to preform target ball queries efficiently Partition the space into regions of influence 6-11

Bounding the number of balls Intuition: * For a given pair u and v, we only care if min d(q, {u, v}) [ d(u,v) +2, d(u,v) ] 7-1

Bounding the number of balls Intuition: * For a given pair u and v, we only care if min d(q, {u, v}) [ d(u,v) +2, d(u,v) ] if min d(q, {u, v}) > d(u,v) = either u or v are (1 + ) NN 7-2

Bounding the number of balls Intuition: * For a given pair u and v, we only care if min d(q, {u, v}) [ d(u,v) +2, d(u,v) ] if min d(q, {u, v}) > d(u,v) = either u or v are (1 + ) NN if min d(q, {u, v}) < d(u,v) +2 = q has a unique (1 + ) NN v u q d(q, u) < d(u,v) +2 d(q, v) > (1 1 +2 )d(u, v) > ( + 1)d(q, u) * Do not need to grow balls of radius smaller than d(u,v) 4 or larger than 2d(u,v) 7-3

Bounding the number of balls Intuition: * For a given pair u and v, we only care if min d(q, {u, v}) [ d(u,v) +2, d(u,v) ] if min d(q, {u, v}) > d(u,v) = either u or v are (1 + ) NN if min d(q, {u, v}) < d(u,v) +2 = q has a unique (1 + ) NN v u q d(q, u) < d(u,v) +2 d(q, v) > (1 1 +2 )d(u, v) > ( + 1)d(q, u) * Do not need to grow balls of radius smaller than d(u,v) 4 or larger than 2d(u,v) Method 1: for every pair of points {u, v}, construct enough balls to cover [ d(u,v) 4, 2d(u,v) ] on u, v 7-4

Bounding the number of balls Intuition: * For a given pair u and v, we only care if min d(q, {u, v}) [ d(u,v) +2, d(u,v) ] if min d(q, {u, v}) > d(u,v) = either u or v are (1 + ) NN if min d(q, {u, v}) < d(u,v) +2 = q has a unique (1 + ) NN v u q d(q, u) < d(u,v) +2 d(q, v) > (1 1 +2 )d(u, v) > ( + 1)d(q, u) * Do not need to grow balls of radius smaller than d(u,v) 4 or larger than 2d(u,v) Method 1: for every pair of points {u, v}, construct enough balls to cover [ d(u,v) Overall: O(n 2 log +1 ( 2C C 4 )) = O(n2 log(7c ) log(+1) ) = O(n2 1 log(1 )) balls Note: log(1 + ) = 2 /2 + 3 /3... = O() in most cases 4, 2d(u,v) ] on u, v 7-5

Bounding the number of balls Interval Near-Neighbor data structure given a range of distances [a, b], and a set of points P, answers: 1. d P (q) > b 2. d P (q) < a with a witness 3. otherwise, finds a point p P, s.t. d P (q) d(p, q) (1 + )d P (q) 8-1

Bounding the number of balls Interval Near-Neighbor data structure given a range of distances [a, b], and a set of points P, answers: 1. d P (q) > b 2. d P (q) < a with a witness 3. otherwise, finds a point p P, s.t. d P (q) d(p, q) (1 + )d P (q) Can be realized by a set of balls of radius a(1 + ) i for i = 0...M 1, where M = log 1+ (b/a) and a ball of radius b around every point in P 8-2

Bounding the number of balls Interval Near-Neighbor data structure given a range of distances [a, b], and a set of points P, answers: 1. d P (q) > b 2. d P (q) < a with a witness 3. otherwise, finds a point p P, s.t. d P (q) d(p, q) (1 + )d P (q) Can be realized by a set of balls of radius a(1 + ) i for i = 0...M 1, where M = log 1+ (b/a) and a ball of radius b around every point in P Contains O(n 1 log(b/a)) balls. Takes at most 2 target ball queries if 1 or 2 hold, and * O(log(M)) = O(log log(b/a) ) otherwise 8-3

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Build a tree, with an Interval Near Neighbor structure associated with each node (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103) 9-1

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Build a tree, with an Interval Near Neighbor structure associated with each node (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103) 9-2

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Build a tree, with an Interval Near Neighbor structure associated with each node Recursively find min r such that there are n/2 connected components 10 I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 7 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103) 9-3

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Build a tree, with an Interval Near Neighbor structure associated with each node Recursively find min r such that there are n/2 connected components 12 10 I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 7 I(P, r, 2 cµnr/, /4), r = 12, n = 3 For each component find a representative and recursively build the outer tree (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103) 9-4

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Given a query point q: 1) q is outside R descend into the outer tree I(P, r, 2 cµnr/, /4), r = 10, n = 6 q I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 10-1 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103)

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Given a query point q: 2) if q is inside r descend into the cluster I(P, r, 2 cµnr/, /4), r = 10, n = 6 q I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 10-2 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103)

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Given a query point q: I(P, r, 2 cµnr/, /4), r = 10, n = 6 3) otherwise I will return a (1 + 4 )-NN I(P, r, 2 cµnr/, /4), r = 7, n = 3 q I(P, r, 2 cµnr/, /4), r = 12, n = 3 10-3 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103)

Bounding the number of balls A data structure to answer (1 + )-ANN queries on general points Given a query point q: I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 Because of rounding up, after each step, continue on set containing n/2 + 1 points = number of steps log 3/2 n 10-4 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103)

Bounding the number of balls 1) q is outside R descend into the outer tree 2) if q is inside r descend into the cluster 3) otherwise I will return a (1 + 4 )-NN Note that: last step is always 3) no error is incurred in 2) diameter of a cluster 2nr = error in 1) is at most (1 + cµ ) I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 Thus, overall error is bounded by: (1+ log 3/2 n 4 ) (1+ cµ ) exp( log 3/2 n 4 ) exp( cµ ) exp log 3/2 n 4 + exp (/2) (1+) cµ i=1 if µ = log 3/2 n, c = 4 and < 1 i=1 i=1 I(P, r, 2 cµnr/, /4), r = 12, n = 3 11-1

Bounding the number of balls Overall Number of Balls: Since the depth of the tree is at most log 3/2 n each node ν has I(P ν, r, 2 cµnr/, /4) with M = n log n balls we get an immediate bound of O(M log M) = O(n log(n) log(n log n)) = O(n log 2 n) I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 12-1 (Sariel Har-Peled: A Replacement for Voronoi Diagrams of Near Linear Size. FOCS 2001: 94-103)

Bounding the number of balls Overall Number of Balls: Since the depth of the tree is at most log 3/2 n each node ν has I(P ν, r, 2 cµnr/, /4) with M = n log n balls we get an immediate bound of O(M log M) = O(n log(n) log(n log n)) = O(n log 2 n) I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 However, can achieve O(n log n) by considering the connection with the Cluster Tree 6 4 3 5 2 12-2 1 S. Sen, N. Sharma, Y. Sabharwal: Nearest Neighbors Search using Point Location in Balls with applications to approximate Voronoi Decompositions Journal of Computer and System Sciences, Volume 72(6), September 2006, Pages 955-977.

Bounding the number of balls Overall Number of Balls: Since the depth of the tree is at most log 3/2 n each node ν has I(P ν, r, 2 cµnr/, /4) with M = n log n balls we get an immediate bound of O(M log M) = O(n log(n) log(n log n)) = O(n log 2 n) I(P, r, 2 cµnr/, /4), r = 10, n = 6 I(P, r, 2 cµnr/, /4), r = 7, n = 3 I(P, r, 2 cµnr/, /4), r = 12, n = 3 However, can achieve O(n log n) by considering the connection with the Cluster Tree 6 4 3 5 2 12-3 1 S. Sen, N. Sharma, Y. Sabharwal: Nearest Neighbors Search using Point Location in Balls with applications to approximate Voronoi Decompositions Journal of Computer and System Sciences, Volume 72(6), September 2006, Pages 955-977.

Bounding the number of balls Overall Number of Balls: I(P, r, 2 cµnr/, /4), r = 10, n = 6 Since the depth of the tree is at most log 3/2 n I(P, r, 2 cµnr/, /4), r = 7, n = 3 each node ν has I(P ν, r, 2 cµnr/, /4) with M = n log n balls we get an immediate bound of O(M log M) = O(n log(n) log(n log n)) = O(n log 2 n) I(P, r, 2 cµnr/, /4), r = 12, n = 3 However, can achieve O(n log n) by considering the connection with the Cluster Tree 6 5 4 2 4 6 3 1 3 5 2 1 12-4

Bounding the number of balls Overall Number of Balls: I(P, r, 2 cµnr/, /4), r = 10, n = 6 Since the depth of the tree is at most log 3/2 n I(P, r, 2 cµnr/, /4), r = 7, n = 3 each node ν has I(P ν, r, 2 cµnr/, /4) with M = n log n balls we get an immediate bound of O(M log M) = O(n log(n) log(n log n)) = O(n log 2 n) I(P, r, 2 cµnr/, /4), r = 12, n = 3 However, can achieve O(n log n) by considering the connection with the Cluster Tree 6 5 4 2 4 6 3 1 3 5 r loss (2) 2 r loss (p) = radius of the ball around p, when p ceases to be a root 1 12-5

Bounding the number of balls 5 I(P, r, 2 cµnr/, /4), r = 10, n = 6 Apart from the outer trees, going down the (1 + ) ANN tree is equivalent to disconnecting edges of the MST tree 2 4 6 I(P, r, 2 cµnr/, /4), r = 12, n = 3 I(P, r, 2 cµnr/, /4), r = 7, n = 3 1 3 The subtrees of a node are disjoint in edges = can charge at least 1 edge to each child. 6 Namely: if n ν is the number of children of ν P ν = O(n ν ) and ν D n ν = O(n) Thus, total number of balls: ( nν O log µn ) ( ) ν n n log n = O log ν D ( n = O log n ) q 4 5 2 1 3 13-1

Construction Time Construction time will be dominated by constructing the tree D Can be constructed directly from the cluster tree but this takes time O(n 2 ) time 14-1

Construction Time Construction time will be dominated by constructing the tree D Can be constructed directly from the cluster tree but this takes time O(n 2 ) time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time 14-2

Construction Time Construction time will be dominated by constructing the tree D Can be constructed directly from the cluster tree but this takes time O(n 2 ) time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time Only possible in R d, in general no HST can be computed in subquadratic time 6 4 3 5 2 1 14-3

Construction Time Construction time will be dominated by constructing the tree D Can be constructed directly from the cluster tree but this takes time O(n 2 ) time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time Only possible in R d, in general no HST can be computed in subquadratic time 18 24 6 4 9 4 2 1.5 3 6 12 5 3 2 1 1 2 3 4 5 6 1 14-4

Construction Time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time To compensate for the approximation factor, grow more balls: Instead of I(P, r, 2 cµnr/, /4) construct I(P, r/(2n), 2 cµnr/, /4) 9 18 24 6 12 1 2 3 4 5 6 15-1

Construction Time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time To compensate for the approximation factor, grow more balls: Instead of I(P, r, 2 cµnr/, /4) construct I(P, r/(2n), 2 cµnr/, /4) 9 18 24 Instead of O( n log b a ) = O(n log n) will have: O( n nr log r ) = O( n n log n2 ) = O( n log n) balls at every node 6 12 1 2 3 4 5 6 15-2

Construction Time In R d the cluster tree can be (2n 2)-approximated by a HST in O(n log n) time: 1. construct a 2-spanner of P of size O(n) in O(n log n) time 2. construct an HST that (n 1) approximates the spanner in O(n log n) time To compensate for the approximation factor, grow more balls: Instead of I(P, r, 2 cµnr/, /4) construct I(P, r/(2n), 2 cµnr/, /4) 9 18 24 Instead of O( n log b a ) = O(n log n) will have: O( n nr log r ) = O( n n log n2 ) = O( n log n) balls at every node 6 12 1 2 3 4 5 6 Same asymptotic space and time complexity 15-3

Answering ANN queries Haven t made our life easier, since answering target ball queries is a difficult problem 16-1

Answering ANN queries Haven t made our life easier, since answering target ball queries is a difficult problem Don t need exact balls (1 + ) ball r b is (1 + ) approximation of b = b(p, r), if b b b(p, r(1 + ) (1 + )r 16-2

Answering ANN queries Haven t made our life easier, since answering target ball queries is a difficult problem Don t need exact balls (1 + ) ball r b is (1 + ) approximation of b = b(p, r), if b b b(p, r(1 + ) (1 + )r 16-3

Answering ANN queries Haven t made our life easier, since answering target ball queries is a difficult problem Don t need exact balls (1 + ) ball r b is (1 + ) approximation of b = b(p, r), if b b b(p, r(1 + ) (1 + )r 16-4

Answering ANN queries Haven t made our life easier, since answering target ball queries is a difficult problem Don t need exact balls (1 + ) ball r b is (1 + ) approximation of b = b(p, r), if b b b(p, r(1 + ) (1 + )r Consider Interval Near Neighbor structure on approximate balls: If I (P, r, R, /16) is a (1 + /16) approximation to I(P, r, R, /16) If for point q, I (P, r, R, /16) returns a ball (p, α), α [r, R] = p is (1 + /4)-ANN to q: r(1 + /16) i d P (q) d(p, q) r(1 + /16) i+1 (1 + /16) (1 + /4)r 16-5

Fast ANN in R d The distance between 2 points in a d-dimensional cell of size α is at most d i=1 α2 = dα For a given ball, b(p, r), construct a grid centered at p, with cell-size 2 i, s.t. d2 i (r) 16 Call, b the set of cells that intersect b(p, r) ( b is a (1 + /16) approximate ball, and contains O ) r d (r) d ( ) d = O cells 1 17-1

Fast ANN in R d The distance between 2 points in a d-dimensional cell of size α is at most d i=1 α2 = dα For a given ball, b(p, r), construct a grid centered at p, with cell-size 2 i, s.t. d2 i (r) 16 Call, b the set of cells that intersect b(p, r) ( b is a (1 + /16) approximate ball, and contains O ) r d (r) d ( ) d = O cells 1 17-2

Fast ANN in R d Fix the origin, and construct grid-cells from there If there are 2 cells with the same size, pick the one, corresponding to the smallest ball Thus construct an approximate I-(1 + /16) data structure C 18-1

Fast ANN in R d Fix the origin, and construct grid-cells from there If there are 2 cells with the same size, pick the one, corresponding to the smallest ball Thus construct an approximate I-(1 + /16) data structure C Finding the smallest ball containing q finding the smallest grid-cell containing q 18-2

Fast ANN in R d Fix the origin, and construct grid-cells from there If there are 2 cells with the same size, pick the one, corresponding to the smallest ball Thus construct an approximate I-(1 + /16) data structure C Finding the smallest ball containing q finding the smallest grid-cell containing q Encode all the cells of C into a compressed quad-tree, such that each cell appears as a node Construction takes O( C log C ) time Finding the appropriate node in C takes O(log C ) time If information about smallest ball is propagated down the tree, answering a query takes O(log C ) 18-3

Fast ANN in R d Fix the origin, and construct grid-cells from there If there are 2 cells with the same size, pick the one, corresponding to the smallest ball Thus construct an approximate I-(1 + /16) data structure C Finding the smallest ball containing q finding the smallest grid-cell containing q Encode all the cells of C into a compressed quad-tree, such that each cell appears as a node Construction takes O( C log C ) time Finding the appropriate node in C takes O(log C ) time If information about smallest ball is propagated down the tree, answering a query takes O(log C ) Recall that we had a data structure with O( n log n ) balls. Each ball is approximated by O( 1 ) cells d The overall complexity of the quad-tree is O(N), where N = O( n log n d+1 ). By noticing that there are many balls of similar sizes, we reduce the complexity to: Construction: O(n d log 2 (n/) time Storage: O(n d log(n/) space Point location query: O(log(n/)) 18-4