Nearest Neighbor Search in Spatial

Size: px
Start display at page:

Download "Nearest Neighbor Search in Spatial"

Transcription

1 Nearest Neighbor Search in Spatial and Spatiotemporal Databases 1

2 Content 1. Conventional NN Queries 2. Reverse NN Queries 3. Aggregate NN Queries 4. NN Queries with Validity Information 5. Continuous NN Monitoring 6. NN in Non-Euclidean Spaces 7. Private NN Queries 2

3 1. Conventional Nearest Neighbor (NN) Query Given a query location q, find the nearest object. a q Depth First and Best-First Search using R-(TPR-) trees Our goal: avoid visiting nodes that cannot contain results 3

4 Basic Pruning Metric: MINDIST Minimum distance between qand a minimum bounding rectangle MBR. E 1 mindist(e 1,q) q p mindist(e 1,q) is a lower bound of d(o,q) for every object o in E 1. If we have found a candidate NN p, we can prune every MBR whose mindist> d(p, q). 4

5 E 1 Depth-First (DF) NN Algorithm y axis e b c f g E 4 E 5 E 1 d E 3 h a E 2 E 6 i query m j k E 7 x axis Root E 1 E E 3 E 4 E 5 E 6 E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7 l Note: distances not actually stored inside nodes. Only for illustration E2

6 DF Search Visit E 1 y axis m E 2 g h E 6 e f d E b 3 a c E 4 E 5 E 1 i query E 7 l k j l m x axis Root E 1 E E E E E E E E2 a b c d e f g i j 5 h 2 k E 3 E 4 E 5 E 6 E 7

7 DF Search Find Candidate NN a y axis m E 2 g h E 6 e f d E b 3 a c E 4 E 5 E 1 i query E 7 l k j l m x axis Root E 1 E 2 First Candidate NN: a with distance E E E E E E E2 a b c d e f g i j h 5 2 k E 3 E 4 E 5 E 6 E 7

8 DF Search Backtrack to Root and Visit E y axis m E 2 g h E 6 e f d E b 3 a c E 4 E 5 E 1 i query E 7 l k j x axis Root E 1 E First Candidate NN: a with distance 5 E E E E E E a b c d e f g i j 5 h 2 E2 k E 3 E 4 E 5 E 6 E 7

9 E 1 DF Search Find Actual NN i y axis e b c f g d E 3 h E 4 E 5 E 1 a Root E 1 E E 2 E 6 i m query j k E 7 x axis E 3 E 4 E 5 E 6 E l First Candidate NN: a with distance 5 Actual NN: i with distance 2 a b c d e f g i j h k l m 5 2 E E 3 4 E 5 E E 6 7 E2

10 y axis e b c f g d E 3 Optimality h E 4 E 5 E 1 a E 2 E 6 i m query j Question: Which is the minimal set of nodes that must be visited by any NN algorithm? Answer: The set of nodes whose MINDIST is smaller than or equal to the distance between qand its NN (e.g., E 1, E 2, E 6 ). k E 7 x axis l 10

11 Best-First (BF) NN Algorithm (Optimal) Keep a heap Hof index entries and objects, ordered by MINDIST. Initially, H contains the root. While H φ Extract the element with minimum MINDIST If it is an index entry, insert its children into H. If it is an object, return it as NN. End while 11

12 BF Search Visit root y axis m E 2 g h E 6 e f d E b 3 a c E 4 E 5 E 1 i query E 7 l k j x axis Root E 1 E Action Visit Root Heap E 1 1 E 2 2 E E E E E E E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7

13 y axis e b c f g d E 3 h E 4 E 5 E 1 a E 2 E 6 i m query j BF Search Visit E 1 k E 7 x axis l Root Action Visit Root Heap E 1 1 follow E 1 E 2 2 E 2 2 E 3 5 E 5 5 E 9 4 E 1 E E E E E E E E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7

14 y axis e b c f g d E 3 h E 4 E 5 E 1 a E 2 E 6 i m query j BF Search Visit E 2 k E 7 x axis l Root E 1 E Action Visit Root Heap E 1 1 follow E 1 E 2 2 follow E 2 E 6 2 E 2 2 E 3 5 E 5 3 E 5 5 E 5 5 E 9 4 E 9 4 E 13 7 E E E E E E E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7

15 BF Search Visit E 6 y axis e b f g d E 3 h E 4 E 5 E 1 a E 2 E 6 i m query j k E 7 l Action Visit Root Heap E 1 1 follow E 1 E 2 2 follow E 2 E 6 2 follow E 6 E 2 2 E 3 5 E 5 3 i 2 E 5 3 E 5 5 E 5 5 E 5 5 E 9 4 E 9 4 E 9 4 E 7 13 j 10 E 7 13 k 13 2 c x axis Root E 1 E E E E E E E E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7

16 y axis e b c f g d E 3 h E 4 E 5 E 1 a BF Search Find Actual NN i E 2 E 6 i m query j k E 7 x axis l Root E 1 E Action Visit Root Heap E 1 1 follow E 1 E 2 2 follow E 2 E 6 2 follow E 6 E 2 2 E 3 5 E 5 3 i 2 E 5 3 E 5 5 E 5 5 E 5 5 Report i and terminate E 9 4 E 9 4 E 9 4 E 7 13 j 10 E 7 13 k 13 E E E E E E E a b c d e f g i j h k l m E E 3 4 E 5 E E 6 7

17 Discussion about Conventional NN Queries (1) Both DF and BF can be easily adapted to (i) extended (instead of point) objects and (ii) retrieval of k(>1) NN. BF is incremental; i.e., it can report the NN in increasing order of distance without a given value of k. Example: find the 10 closest cities to HK with population more than 1 million. We may have to retrieve many (>>10) cities around Hong Kong in order to answer the query. 17

18 Discussion about Conventional NN Queries (2) Both DF and BF can be used to process NN queries on moving objects indexed by TPR-trees: find my NN during the next minute (based on the current motion). q a b 18

19 2. Reverse NN Queries Monochromatic: given a multi-dimensional dataset Pand a point q, find all the points p Pthat have qas their nearest neighbor Bichromatic: given a set Qof queries and a query point q, find the objects p Pthat are closer to qthan any other point ofq p 1 q p 3 p 4 RNN(q)=p 1, p 2 NN(q)= p 3 p 2 19

20 KM Algorithm (for static datasets) Find the NN of every data point p-let the vicinitycircle(p, dist(p,nn(p))) centeredatpwithradius equalto the Euclideandistance betweenpand itsnn. Index the MBRsof all circleswithan R-tree, calledthe RNN-tree The reverse nearestneighborsof qare retrievedby a point location queryon the RNN-tree, whichreturnsall circles that contain q. 20

21 SAA Algorithm (supports updates) Divide the space around the query qinto six equal regions S 1 to S 6. Find the NN p i of qin each region S i Find the NN of each p i if distance(p i, NN(p i )) < distance(p i, q) there is no RNN of qin S i otherwise, the only RNN of qin S i is p i. p 2 p 1 60 o p 3 S 1 S 2 S 3 p 4 60 o q p 5 p 6 p7 S 4 S 6 S 5 21

22 TPL Algorithm (supports updates, >2 dimensions, k-rnn) Filter-refinement approach Find the set S cnd of candidate points Find neighbors of the query point q incrementally Every new neighbor prunes the search space Continue until the entire space is pruned Keep all the pruned points and nodes in a set S rfn Refinement step: eliminate false positives from S cnd First NN N 1 p' p1 N 2 p1 (p, q) perpendicular bisector ( p 1, q ) q ( p 2, q) p2 q Second NN 22

23 Example of TPL 23

24 Filter Step Visit Root Action Heap S cnd S rfn visit root {N 10,N 11,N 12 } 24

25 Filter Step Visit N 10 Action Heap S cnd S rfn visit N 10 {N 3,N 11,N 2,N 1,N 12 } 25

26 Filter Step Visit N 3 Action Heap S cnd S rfn visit N 3 {N 11,N 2,N 1,N 12 } {p 1 } {p 3 } 26

27 Filter Step Visit N 11 Action Heap S cnd S rfn visit N 11 {N 5,N 2,N 1,N 12 } {p 1 } {p 3,N 4,N 6 } 27

28 Filter Step Visit N 5 Action Heap S cnd S rfn visit N 5 {N 2,N 1,N 12 } {p 1,p 2 } {p 3,N 4,N 6,p 6 } 28

29 Filter Step Visit N 1 Action Heap S cnd S rfn visit N 1 {N 12 } {p 1,p 2,p 5 } {p 3,N 4,N 6,p 6,N 2,p 7 } 29

30 Example (end of filter step) Action Heap S cnd S rfn {p 1,p 2,p 5 } {p 3,N 4,N 6,p 6,N 2,p 7,N 12 } 30

31 Refinement Step Nodes outside the cicrles centered at the candidates can be discarded p 3 invalidates candidate p 1 In order to verify candidate p 2 we need to visit N 4 p 5 is an actual RNN Other Work: Bichromatic RNN Road Networks Metric Spaces Continuous RNN 31

32 3. Aggregate NN Queries Given a set Pof data points and a set Qof query points, an aggregate NN query returns the data point pwith the minimum aggregate distance. The aggregate distancebetween a data point pand Q={q 1,.., q n }is defined as adist(p,q) = f( pq 1,, pq n ), where pq i is the Euclidean distance of pand q i. We use f=sumin the examples. 32

33 Multiple Query Method (MQM) Idea based on the thresholdalgorithm for top-kqueries: Perform incrementalnnqueries for each point in Qand combine their results. <p 10, 7>, <p 11, 6>, Threshold T=5 (2+3) Every non-yet visited point must have distance T=5 Since the best candidate found p 11 has distance > T we must continue search 33

34 Multiple Query Method (MQM) We find p 11 again, this time as the second NN of q 1 The value of the threshold is updated to T=6 We already have a candidate with distance equal to T=6 Search terminates 34

35 Minimum Bounding Method (MBM) MQM applies multiple NN queries and may visit the same node and data point multiple times MBM Applies the MBR of Qto prune the search space with a single (DF or BF) traversal, using heuristics to prune the search space Heuristic 1:Let Mbe the MBR of Q, and best_distbe the distance of the best points found so far. A node Ncannot contain qualifying points, if: best_dist mindist( N, M ) n Heuristic 2: A node N cannot contain qualifying points, if: q Q i mindist( N, q ) i best_dist

36 4. NN Queries with Validity Information The result of conventional NN queries may be invalidated as soon as the query or some data object(s) move Goal: To return some additional information that determines the validityof the current result The validity regionis such that the NN set remains the same as long as q remains in the region. The validity region of a NN query qis the VoronoiCell of the NN o. 36

37 Settings: Static Objects, Moving Query. Client/server architecture A central server indexes the Voronoi diagram of the objects Given a query, the server returns the current NN and the validity timeof the result. The current NN is the object whose Voronoi cell covers the query point Restrictions:(i) assumes a maximum speed (ii) applicable only to single NN (iii) and static objects. Validity Time 37

38 Redundant Results Settings: moving client/query, static objects. The client needs to keep updated about the NN set as it moves. Goal: Minimize the number of queries by returning m>k NNs. Example: A client a location q wants to know its NN Server returns m=4 NNs {o,a,c,b} If 2 dist(q,q') dist(q,b)-dist(q,o), THEN the NN at q' is among the 4 NN of the first query. Problem: how to determine m. 38

39 Time parameterized NN Assuming that queries and/or objects move linearly, a TPNN returns: The current NN set R The validity period Tof R The change Cof the result at the end of T y axis e f g h i m k j l Result: R={i}, T=2, C={j} 4 2 b c d query (speed 1) a its position at time 2 x axis

40 Processing TP NN with R-(TPR-) trees Influence time of a MBR: the earliest possible time that any object in the MBR will become the new NN. Algorithm: first compute the conventional NN Then traverse the R-tree (TPRtree) again, visiting nodes in ascending order of their influence time (instead of the mindist). Cost of TPNN queries about the same as that of conventional queries because we have to visit the influencing nodes anyway (to find the NN) y axis P NN q (speed 1) x axis E position at time 2 at this time q has equal distance to P NN and E 40

41 Location Based NN queries (LBNN) A location-based knn query q returns The current knns The validity regionsuch that the result remains the same as long as qremains in the region. The validity region of qis the Voronoi Cell(VC) of the NN o. Based on TPNN queries.

42 Continuous Nearest Neighbors (CNN) Goal: Given a line segment q=[s,e], find the NN of everypoint on q. Result representation: {s(.nn=a), s 1 (.NN=c), s 2 (.NN=f), s 3 (.NN=h), e}. The points (s, s 1, s 2, s 3,e) are the split points, meaning that ais the NN for the interval [s, s 1 ), cis the NN for the interval [s 1, s 2 ) etc. b d s s 1 s 2 s 3 f e a c g h 42

43 Main idea Maintain the set of split points incrementally. After processing a After processing c 43

44 Processing with an R-(TPR-) tree Avoid examination of all points. Given an MBR Eand query segment q, Emust be searched if and only if there exists a split point s i SL such that dist(s i,s i.nn) > mindist(s i, E). 44

45 5. CNN monitoring Continuous NN monitoring: Assumeasetofk-NNqueriesoverasetofmovingobjects. Numerous queries and objects move in an unpredictable manner. A central server continuously monitors the k NNs of each query and reports to the clients. Goal: Minimize CPU overhead at the server and/or network overhead of location updates. Usually main-memory processing using simple grids(r-trees incur expensive updates) 45

46 YPK-CNN Initial Result Computation Visit the cells in a square R around the cell c q covering q until you find the first object (at distance d). Search the cells intersecting the square SR (search region) centered at c q with side length 2 d+δ, and determine the actual NN set. Result Update Compute the maximum distance d max of the current locations of the previous NNs (i.e., d max is the distance of the previous neighbor that moved furthest). The new SR is a square centered at c q with side length 2 d max +δ. 46

47 SEA-CNN SEA-CNN assumes that the initial result has been computed. Let best_dist be the distance of the NN Result Update Similar to YPK-CNN but with circular SR if the query q moves to a new location q, then SEA-CNN sets r =best_dist + dist(q,q ), and computes the new NN set by processing all the objects that lie in the circle centered at q with radius r. p 7 p 3 p 2 Answer region best_dist p 1 q p 6 p 4 p 5 best_dist+dist(q,q') q' SR p 8 p 10 p 9 47

48 CPM Conceptual Partitioning Update handling when Motivation: when computing outgoing objects are more the initial result we only visit than incoming ones the minimum number of cells (without computing the distance of all cells) Update handling when incoming objects are at least as many as outgoing ones p' 2 p 3 p' 3 p 2 best_dist q p 1 Influence region 48

49 Threshold-based NN 1. YPK-CNN, SEA-CNN, CPM assume that objects issue updates whenever they move (even if they do not influence a query). Their goal is to minimize the computation overhead at the server. 2. TB minimizes the network overhead for location updates(orthogonal to previous algorithms). 3. Main idea: avoid object updates when their movement does not affect any query. 4. Thresholds to objects: No violations no result change, no need for updates. 5. Applicable to non-euclidean spaces and arbitrary movement. 49

50 Related Work on Moniroring Range query monitoring. Approximate NN monitoring. Minimization of network overhead. Continuous Monitoring of Reverse NN. NN processing using air indexes in broadcasting environments. 50

51 6: NN queries in Non-Euclidean Spaces Road Networks Find my nearest hotel in terms of driving distance. Answer: Hotel b(the Euclidean NN is d) 51

52 Euclidean Restriction Algorithm 1st Euclidean NN 2nd Euclidean NN 52

53 Network Expansion Algorithm n 4 1 p 5 2 n n p 3 n 5 p 2 q 3 p 1 2 n 1 n 6 p n 8 n 7 n 9 53

54 Related Work on Road NN knns based on pre-computed network Voronoi cells. Path knn queries (k-nns for every point on a trajectory). SurfacekNN Query Processing. Continuous Monitoring. 54

55 NN in the presence of obstacles The NN of qin terms of obstructed distance is b, although the Euclidean NN is a. 55

56 Visibility graphs Have been used widely in Computational Geometry for shortest path problems (e.g., find the shortest path from p start to p end that does not cross any obstacle). p start p end Problem: We cannot maintain the entire visibility graph in memory for real spatial datasets. Solution: We only need the obstacles and objects that affect the result of the query. 56

57 Obstacle nearest neighbor algorithm Idea: Similar to the Euclidean Restriction algorithm for road networks. BUT how do we perform the obstructed distance computations? 57

58 Obstructed distance computation Goal: compute the obstructed distance between p and q. First retrieve obstacles o 1, o 2 in the Euclidean range. Compute a provisional distance d 1 (p,q) using only o 1, o 2. d 1 (p,q) is not enough because the shortest path is obstructed by o 3. Perform a second Euclidean range query on the obstacle R-tree using d 1 (p,q) and retrieve o 3, o 4. Compute a new obstructed distance d 2 (p,q) taking o 3, o 4 into account. Repeat the process until the obstructed distance remains the same for two consecutive iterations. 58

59 Related work Visible Nearest Neighbor Queries: We only care about objects visible from the query point. Visible Reverse Nearest Neighbors. Continuous Obstructed Nearest Neighbor Queries. 59

60 7. Private NN Queries Anonymizer model 1. A user/client Usends its NN query and anonymity requirementkto the anonymizer, which maintains the current locations of numerous clients. 2. The anonymizer removes the user ID and selects an anonymizing set(as) that contains Uand at least K-1 other clients in its vicinity. The K-anonymizing spatial region(k-asr or ASR) is an area that spatially encloses AS. 3. The anonymizer forwards the ASR to the LBS that stores the spatial data. 4. (iv) The LBS processes the query and returns to the anonymizer a set of candidate results. 5. The anonymizer removes the false hits and forwards the actual result to U. user U actual position, query, K secure connection actual results ASR, query n o n s e c u r e c o n n e c t i o n candidate results Anonymizer LBS location data objects

61 Casper The anonymizer maintains the locations of the clients using a pyramid data structure, similar to a Quad-tree, where the minimum cell size corresponds to the anonymity resolution. Once the anonymizer receives a query from U, it uses a hash table on the user ID pointing to the lowest-level cell cwhere U lies. If ccontains enough users (i.e., c K) for the anonymity requirements, it forms the ASR. Otherwise ( c <K), the horizontal c h and vertical c v neighbors of care retrieved. If c c h Kor c c v K, the corresponding union of cells becomes the ASR. On the other hand, if c c h < Kand c c v < K, the anonymizer retrieves the parent of c and repeats this process recursively. 61

62 Assume a query qwith K=2. Casper Example If qis issued by U 1 oru 2, the ASR is cell (0,2), (1,3). If qis issued by U 3 oru 4, the ASR is the union of cells (1,2), (2,3) (1,3), (2,4). If qis issued by U 5, the ASR is the entire data space. (1,4) U 4 (2,4) (4,4) Problem of Casper: (0,3) (0,2) U 1 (1,3) U 2 U3 (1,2) (2,1) (2,3) (2,2) U 5 (3,2) (4,2) Although U 1 to U 4 are in the 2-ASR of U 5, U 5 is not in the 2-ASR of any of those users. Consequently, an attacker that detects an ASR covering the entire space can infer with high probability that it originates from U 5. (0,0) (2,0) 62

63 Reciprocity Property Consider a user Uissuing a query with anonymity degree K, anonymizing set AS, and anonymizing spatial region ASR. AS satisfies reciprocityif (i) it contains Uand at least K-1 additional users, and (ii) every user in AS also generates the same anonymizing set AS for the given K. Consider an algorithm Center Cloak that, given a query from U, finds its K-1 closest users and sets the ASR as the MBR that encloses them. Given a query from U 3, U 4, or U 5, Center Cloak would generate the top ASR. If the query originates from U 6, it would generate the bottom ASR. The second ASR violates anonymity because an attacker can be sure that the query is issued by U 6, as the same ASR could not be generated for any other client. Specifically, any AS involving 3 users and created for U 4 or U 5, would contain U 3, but not U 6. 4 y y U 2 U 1 U 2 U 1 U U U 3 U 4 3-ASR for U 3,U 4, or U 5 U 5 U 3 U ASR for U 6 U x x

64 Hilbert Cloak Uses the Hilbert space filling curve to map the 2-D space into 1-D values. These values are then indexed by an annotated B+-tree, which supports efficient search by value or by rank (i.e., position in the 1-D sorted list). The algorithm partitions the 1-D sorted list into groups of Kusers (the last group may have up to 2K-1 users). For a querying user Uthe algorithm finds the group where Ubelongs, and returns the MBR of the group as the ASR. The same ASR is returned for any user in a given group; therefore the algorithm is reciprocal. U 10 U 8 Buckets for K=3 U 9 U 6 U 7 U 1 U 2 U 3 U 4 U 5 U 6 U 7 U 8 U 9 U 10 Buckets for K=4 U 2 U 5 U 1 U 2 U 3 U 4 U 5 U 6 U 7 U 8 U 9 U 10 U 3 U 1 Hilbert Curve U 4 64

65 Additional Topics Location privacy without anonymizer Private information retrieval Anonymization of trajectories Query processing techniques for anonymized queries 65

Spatial Database. Ahmad Alhilal, Dimitris Tsaras

Spatial Database. Ahmad Alhilal, Dimitris Tsaras Spatial Database Ahmad Alhilal, Dimitris Tsaras Content What is Spatial DB Modeling Spatial DB Spatial Queries The R-tree Range Query NN Query Aggregation Query RNN Query NN Queries with Validity Information

More information

Query Processing in Spatial Network Databases

Query Processing in Spatial Network Databases Temporal and Spatial Data Management Fall 0 Query Processing in Spatial Network Databases SL06 Spatial network databases Shortest Path Incremental Euclidean Restriction Incremental Network Expansion Spatial

More information

Skylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland

Skylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland Yufei Tao ITEE University of Queensland Today we will discuss problems closely related to the topic of multi-criteria optimization, where one aims to identify objects that strike a good balance often optimal

More information

Nearest Neighbor Search with Keywords

Nearest Neighbor Search with Keywords Nearest Neighbor Search with Keywords Yufei Tao KAIST June 3, 2013 In recent years, many search engines have started to support queries that combine keyword search with geography-related predicates (e.g.,

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Tailored Bregman Ball Trees for Effective Nearest Neighbors

Tailored Bregman Ball Trees for Effective Nearest Neighbors Tailored Bregman Ball Trees for Effective Nearest Neighbors Frank Nielsen 1 Paolo Piro 2 Michel Barlaud 2 1 Ecole Polytechnique, LIX, Palaiseau, France 2 CNRS / University of Nice-Sophia Antipolis, Sophia

More information

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 13 Indexes for Multimedia Data 13 Indexes for Multimedia

More information

Evaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments

Evaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments Evaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments Reynold Cheng, Dmitri V. Kalashnikov Sunil Prabhakar The Hong Kong Polytechnic University, Hung Hom, Kowloon,

More information

Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data

Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data 1/29/2010 13 Indexes for Multimedia Data 13 Indexes for Multimedia Data 13.1 R-Trees Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

More information

Efficient Continuously Moving Top-K Spatial Keyword Query Processing

Efficient Continuously Moving Top-K Spatial Keyword Query Processing Efficient Continuously Moving Top-K Spatial Keyword Query Processing Dingming Wu Man Lung Yiu 2 Christian S. Jensen 3 Gao Cong 4 Department of Computer Science, Aalborg University, Denmark, dingming@cs.aau.dk

More information

Nearest Neighbor Search with Keywords in Spatial Databases

Nearest Neighbor Search with Keywords in Spatial Databases 776 Nearest Neighbor Search with Keywords in Spatial Databases 1 Sphurti S. Sao, 2 Dr. Rahila Sheikh 1 M. Tech Student IV Sem, Dept of CSE, RCERT Chandrapur, MH, India 2 Head of Department, Dept of CSE,

More information

THE AREA CODE TREE FOR APPROXIMATE NEAREST NEIGHBOUR SEARCH IN DENSE POINT SETS. FATEMA RAHMAN Bachelor of Science, Jahangirnagar University, 2007

THE AREA CODE TREE FOR APPROXIMATE NEAREST NEIGHBOUR SEARCH IN DENSE POINT SETS. FATEMA RAHMAN Bachelor of Science, Jahangirnagar University, 2007 THE AREA CODE TREE FOR APPROXIMATE NEAREST NEIGHBOUR SEARCH IN DENSE POINT SETS FATEMA RAHMAN Bachelor of Science, Jahangirnagar University, 2007 A Thesis Submitted to the School of Graduate Studies of

More information

Outline for today. Information Retrieval. Cosine similarity between query and document. tf-idf weighting

Outline for today. Information Retrieval. Cosine similarity between query and document. tf-idf weighting Outline for today Information Retrieval Efficient Scoring and Ranking Recap on ranked retrieval Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University Efficient

More information

arxiv: v2 [cs.db] 5 May 2011

arxiv: v2 [cs.db] 5 May 2011 Inverse Queries For Multidimensional Spaces Thomas Bernecker 1, Tobias Emrich 1, Hans-Peter Kriegel 1, Nikos Mamoulis 2, Matthias Renz 1, Shiming Zhang 2, and Andreas Züfle 1 arxiv:113.172v2 [cs.db] May

More information

Monochromatic and Bichromatic Reverse Skyline Search over Uncertain Databases

Monochromatic and Bichromatic Reverse Skyline Search over Uncertain Databases Monochromatic and Bichromatic Reverse Skyline Search over Uncertain Databases Xiang Lian and Lei Chen Department of Computer Science and Engineering Hong Kong University of Science and Technology Clear

More information

Exact and Approximate Flexible Aggregate Similarity Search

Exact and Approximate Flexible Aggregate Similarity Search Noname manuscript No. (will be inserted by the editor) Exact and Approximate Flexible Aggregate Similarity Search Feifei Li, Ke Yi 2, Yufei Tao 3, Bin Yao 4, Yang Li 4, Dong Xie 4, Min Wang 5 University

More information

Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data

Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data VLDB Journal manuscript No. (will be inserted by the editor) Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data Jinchuan Chen Reynold Cheng Mohamed

More information

Multi-Dimensional Top-k Dominating Queries

Multi-Dimensional Top-k Dominating Queries Noname manuscript No. (will be inserted by the editor) Multi-Dimensional Top-k Dominating Queries Man Lung Yiu Nikos Mamoulis Abstract The top-k dominating query returns k data objects which dominate the

More information

A Survey on Spatial-Keyword Search

A Survey on Spatial-Keyword Search A Survey on Spatial-Keyword Search (COMP 6311C Advanced Data Management) Nikolaos Armenatzoglou 06/03/2012 Outline Problem Definition and Motivation Query Types Query Processing Techniques Indices Algorithms

More information

Location-Based R-Tree and Grid-Index for Nearest Neighbor Query Processing

Location-Based R-Tree and Grid-Index for Nearest Neighbor Query Processing ISBN 978-93-84468-20-0 Proceedings of 2015 International Conference on Future Computational Technologies (ICFCT'2015) Singapore, March 29-30, 2015, pp. 136-142 Location-Based R-Tree and Grid-Index for

More information

Multi-Approximate-Keyword Routing Query

Multi-Approximate-Keyword Routing Query Bin Yao 1, Mingwang Tang 2, Feifei Li 2 1 Department of Computer Science and Engineering Shanghai Jiao Tong University, P. R. China 2 School of Computing University of Utah, USA Outline 1 Introduction

More information

Pavel Zezula, Giuseppe Amato,

Pavel Zezula, Giuseppe Amato, SIMILARITY SEARCH The Metric Space Approach Pavel Zezula Giuseppe Amato Vlastislav Dohnal Michal Batko Table of Content Part I: Metric searching in a nutshell Foundations of metric space searching Survey

More information

Association Rules. Fundamentals

Association Rules. Fundamentals Politecnico di Torino Politecnico di Torino 1 Association rules Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket counter Association rule

More information

High-Dimensional Indexing by Distributed Aggregation

High-Dimensional Indexing by Distributed Aggregation High-Dimensional Indexing by Distributed Aggregation Yufei Tao ITEE University of Queensland In this lecture, we will learn a new approach for indexing high-dimensional points. The approach borrows ideas

More information

Composite Quantization for Approximate Nearest Neighbor Search

Composite Quantization for Approximate Nearest Neighbor Search Composite Quantization for Approximate Nearest Neighbor Search Jingdong Wang Lead Researcher Microsoft Research http://research.microsoft.com/~jingdw ICML 104, joint work with my interns Ting Zhang from

More information

A safe region based approach to moving KNN queries in obstructed space

A safe region based approach to moving KNN queries in obstructed space Knowl Inf Syst DOI.7/s115-014-0803-6 REGULAR PAPER A safe region based approach to moving KNN queries in obstructed space Chuanwen Li Yu Gu Jianzhong Qi Rui Zhang Ge Yu Received: 26 April 2013/Revised:

More information

5 Spatial Access Methods

5 Spatial Access Methods 5 Spatial Access Methods 5.1 Quadtree 5.2 R-tree 5.3 K-d tree 5.4 BSP tree 3 27 A 17 5 G E 4 7 13 28 B 26 11 J 29 18 K 5.5 Grid file 9 31 8 17 24 28 5.6 Summary D C 21 22 F 15 23 H Spatial Databases and

More information

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. SAMs - Detailed outline. Spatial Access Methods - problem

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. SAMs - Detailed outline. Spatial Access Methods - problem Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #26: Spatial Databases (R&G ch. 28) SAMs - Detailed outline spatial access methods problem dfn R-trees Faloutsos 2

More information

A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases

A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases Thomas Bernecker #, Tobias Emrich #, Hans-Peter Kriegel #, Nikos Mamoulis, Matthias Renz #, Andreas Züfle #

More information

Flexible Aggregate Similarity Search

Flexible Aggregate Similarity Search Flexible Aggregate Similarity Search Yang Li Feifei Li 2 Ke Yi 3 Bin Yao 2 Min Wang 4 rainfallen@sjtu.edu.cn, {lifeifei, yao}@cs.fsu.edu 2, yike@cse.ust.hk 3, min.wang6@hp.com 4 Shanghai JiaoTong University,

More information

12 Review and Outlook

12 Review and Outlook 12 Review and Outlook 12.1 Review 12.2 Outlook http://www-kdd.isti.cnr.it/nwa Spatial Databases and GIS Karl Neumann, Sarah Tauscher Ifis TU Braunschweig 926 What are the basic functions of a geographic

More information

I may not have gone where I intended to go, but I think I have ended up where I needed to be. Douglas Adams

I may not have gone where I intended to go, but I think I have ended up where I needed to be. Douglas Adams Disclaimer: I use these notes as a guide rather than a comprehensive coverage of the topic. They are neither a substitute for attending the lectures nor for reading the assigned material. I may not have

More information

D B M G Data Base and Data Mining Group of Politecnico di Torino

D B M G Data Base and Data Mining Group of Politecnico di Torino Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Association rules Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket

More information

The direction-constrained k nearest neighbor query

The direction-constrained k nearest neighbor query Geoinformatica (2016) 20:471 502 DOI 10.1007/s10707-016-0245-2 The direction-constrained k nearest neighbor query Dealing with spatio-directional objects Min-Joong Lee 2 Dong-Wan Choi 3 SangYeon Kim 2

More information

LOCALITY SENSITIVE HASHING FOR BIG DATA

LOCALITY SENSITIVE HASHING FOR BIG DATA 1 LOCALITY SENSITIVE HASHING FOR BIG DATA Wei Wang, University of New South Wales Outline 2 NN and c-ann Existing LSH methods for Large Data Our approach: SRS [VLDB 2015] Conclusions NN and c-ann Queries

More information

D B M G. Association Rules. Fundamentals. Fundamentals. Elena Baralis, Silvia Chiusano. Politecnico di Torino 1. Definitions.

D B M G. Association Rules. Fundamentals. Fundamentals. Elena Baralis, Silvia Chiusano. Politecnico di Torino 1. Definitions. Definitions Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Itemset is a set including one or more items Example: {Beer, Diapers} k-itemset is an itemset that contains k

More information

D B M G. Association Rules. Fundamentals. Fundamentals. Association rules. Association rule mining. Definitions. Rule quality metrics: example

D B M G. Association Rules. Fundamentals. Fundamentals. Association rules. Association rule mining. Definitions. Rule quality metrics: example Association rules Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket

More information

5 Spatial Access Methods

5 Spatial Access Methods 5 Spatial Access Methods 5.1 Quadtree 5.2 R-tree 5.3 K-d tree 5.4 BSP tree 3 27 A 17 5 G E 4 7 13 28 B 26 11 J 29 18 K 5.5 Grid file 9 31 8 17 24 28 5.6 Summary D C 21 22 F 15 23 H Spatial Databases and

More information

CS/COE

CS/COE CS/COE 1501 www.cs.pitt.edu/~nlf4/cs1501/ P vs NP But first, something completely different... Some computational problems are unsolvable No algorithm can be written that will always produce the correct

More information

CS246 Final Exam, Winter 2011

CS246 Final Exam, Winter 2011 CS246 Final Exam, Winter 2011 1. Your name and student ID. Name:... Student ID:... 2. I agree to comply with Stanford Honor Code. Signature:... 3. There should be 17 numbered pages in this exam (including

More information

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Advanced Implementations of Tables: Balanced Search Trees and Hashing Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the

More information

Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions

Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions Yufei Tao, Reynold Cheng, Xiaokui Xiao, Wang Kay Ngai, Ben Kao, Sunil Prabhakar Department of Computer Science Department

More information

Nearest Neighbor Search for Relevance Feedback

Nearest Neighbor Search for Relevance Feedback Nearest Neighbor earch for Relevance Feedbac Jelena Tešić and B.. Manjunath Electrical and Computer Engineering Department University of California, anta Barbara, CA 93-9 {jelena, manj}@ece.ucsb.edu Abstract

More information

Scheduling Multiple Trips for a Group in Spatial Databases

Scheduling Multiple Trips for a Group in Spatial Databases M.Sc. Engg. Thesis Scheduling Multiple Trips for a Group in Spatial Databases by Roksana Jahan Submitted to Department of Computer Science and Engineering in partial fulfilment of the requirements for

More information

Efficient Spatial Data Structure for Multiversion Management of Engineering Drawings

Efficient Spatial Data Structure for Multiversion Management of Engineering Drawings Efficient Structure for Multiversion Management of Engineering Drawings Yasuaki Nakamura Department of Computer and Media Technologies, Hiroshima City University Hiroshima, 731-3194, Japan and Hiroyuki

More information

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins 11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,

More information

AN EFFICIENT INDEXING STRUCTURE FOR TRAJECTORIES IN GEOGRAPHICAL INFORMATION SYSTEMS SOWJANYA SUNKARA

AN EFFICIENT INDEXING STRUCTURE FOR TRAJECTORIES IN GEOGRAPHICAL INFORMATION SYSTEMS SOWJANYA SUNKARA AN EFFICIENT INDEXING STRUCTURE FOR TRAJECTORIES IN GEOGRAPHICAL INFORMATION SYSTEMS By SOWJANYA SUNKARA Bachelor of Science (Engineering) in Computer Science University of Madras Chennai, TamilNadu, India

More information

On multi-scale display of geometric objects

On multi-scale display of geometric objects Data & Knowledge Engineering 40 2002) 91±119 www.elsevier.com/locate/datak On multi-scale display of geometric objects Edward P.F. Chan *, Kevin K.W. Chow Department of Computer Science, University of

More information

Cost models for distance joins queries using R-trees

Cost models for distance joins queries using R-trees Data & Knowledge Engineering xxx (2005) xxx xxx www.elsevier.com/locate/datak Cost models for distance joins queries using R-trees Antonio Corral a, Yannis Manolopoulos b, *, Yannis Theodoridis c, Michael

More information

CMU Lecture 4: Informed Search. Teacher: Gianni A. Di Caro

CMU Lecture 4: Informed Search. Teacher: Gianni A. Di Caro CMU 15-781 Lecture 4: Informed Search Teacher: Gianni A. Di Caro UNINFORMED VS. INFORMED Uninformed Can only generate successors and distinguish goals from non-goals Informed Strategies that can distinguish

More information

Jeffrey D. Ullman Stanford University

Jeffrey D. Ullman Stanford University Jeffrey D. Ullman Stanford University 3 We are given a set of training examples, consisting of input-output pairs (x,y), where: 1. x is an item of the type we want to evaluate. 2. y is the value of some

More information

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition Data Mining Classification: Basic Concepts and Techniques Lecture Notes for Chapter 3 by Tan, Steinbach, Karpatne, Kumar 1 Classification: Definition Given a collection of records (training set ) Each

More information

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam. Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing Query Processing Decomposition Localization Optimization CS 347 Notes 3 2 Decomposition Same as in centralized system

More information

Making Nearest Neighbors Easier. Restrictions on Input Algorithms for Nearest Neighbor Search: Lecture 4. Outline. Chapter XI

Making Nearest Neighbors Easier. Restrictions on Input Algorithms for Nearest Neighbor Search: Lecture 4. Outline. Chapter XI Restrictions on Input Algorithms for Nearest Neighbor Search: Lecture 4 Yury Lifshits http://yury.name Steklov Institute of Mathematics at St.Petersburg California Institute of Technology Making Nearest

More information

SEARCHING THROUGH SPATIAL RELATIONSHIPS USING THE 2DR-TREE

SEARCHING THROUGH SPATIAL RELATIONSHIPS USING THE 2DR-TREE SEARCHING THROUGH SPATIAL RELATIONSHIPS USING THE DR-TREE Wendy Osborn Department of Mathematics and Computer Science University of Lethbridge 0 University Drive W Lethbridge, Alberta, Canada email: wendy.osborn@uleth.ca

More information

K-Nearest Neighbor Temporal Aggregate Queries

K-Nearest Neighbor Temporal Aggregate Queries Experiments and Conclusion K-Nearest Neighbor Temporal Aggregate Queries Yu Sun Jianzhong Qi Yu Zheng Rui Zhang Department of Computing and Information Systems University of Melbourne Microsoft Research,

More information

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME:

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME: 1) A GIS data model using an array of cells to store spatial data is termed: a) Topology b) Vector c) Object d) Raster 2) Metadata a) Usually includes map projection, scale, data types and origin, resolution

More information

Continuous Skyline Queries for Moving Objects

Continuous Skyline Queries for Moving Objects Continuous Skyline Queries for Moving Objects ABSTRACT The literature on the skyline algorithms so far mainly deal with queries for static query points over static datasets. With the increasing number

More information

Extended breadth-first search algorithm in practice

Extended breadth-first search algorithm in practice Proceedings of the 9 th International Conference on Applied Informatics Eger, Hungary, January 29 February 1, 2014. Vol. 1. pp. 59 66 doi: 10.14794/ICAI.9.2014.1.59 Extended breadth-first search algorithm

More information

Topics in Approximation Algorithms Solution for Homework 3

Topics in Approximation Algorithms Solution for Homework 3 Topics in Approximation Algorithms Solution for Homework 3 Problem 1 We show that any solution {U t } can be modified to satisfy U τ L τ as follows. Suppose U τ L τ, so there is a vertex v U τ but v L

More information

Optimal Matching between Spatial Datasets under Capacity Constraints

Optimal Matching between Spatial Datasets under Capacity Constraints Optimal Matching between Spatial Datasets under Capacity Constraints LEONG HOU U 1 University of Hong Kong and KYRIAKOS MOURATIDIS Singapore Management University and MAN LUNG YIU 3 Hong Kong Polytechnic

More information

AE = q < H(p < ) + (1 q < )H(p > ) H(p) = p lg(p) (1 p) lg(1 p)

AE = q < H(p < ) + (1 q < )H(p > ) H(p) = p lg(p) (1 p) lg(1 p) 1 Decision Trees (13 pts) Data points are: Negative: (-1, 0) (2, 1) (2, -2) Positive: (0, 0) (1, 0) Construct a decision tree using the algorithm described in the notes for the data above. 1. Show the

More information

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 14 Indexes for Multimedia Data 14 Indexes for Multimedia

More information

Introduction to Arti Intelligence

Introduction to Arti Intelligence Introduction to Arti Intelligence cial Lecture 4: Constraint satisfaction problems 1 / 48 Constraint satisfaction problems: Today Exploiting the representation of a state to accelerate search. Backtracking.

More information

Alpha-Beta Pruning: Algorithm and Analysis

Alpha-Beta Pruning: Algorithm and Analysis Alpha-Beta Pruning: Algorithm and Analysis Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Introduction Alpha-beta pruning is the standard searching procedure used for 2-person

More information

Algorithm Design Strategies V

Algorithm Design Strategies V Algorithm Design Strategies V Joaquim Madeira Version 0.0 October 2016 U. Aveiro, October 2016 1 Overview The 0-1 Knapsack Problem Revisited The Fractional Knapsack Problem Greedy Algorithms Example Coin

More information

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 Star Joins A common structure for data mining of commercial data is the star join. For example, a chain store like Walmart keeps a fact table whose tuples each

More information

Journal of Computer and System Sciences

Journal of Computer and System Sciences Journal of Computer and System Sciences 77 (2011) 637 651 Contents lists available at ScienceDirect Journal of Computer and System Sciences www.elsevier.com/locate/jcss Voronoi-based range and continuous

More information

Venn Sampling: A Novel Prediction Technique for Moving Objects

Venn Sampling: A Novel Prediction Technique for Moving Objects Venn Sampling: A Novel Prediction Technique for Moving Objects Yufei Tao Department of Computer Science City University of Hong Kong Tat Chee Avenue, Hong Kong taoyf@cs.cityu.edu.hk Jian Zhai Department

More information

Online Sorted Range Reporting and Approximating the Mode

Online Sorted Range Reporting and Approximating the Mode Online Sorted Range Reporting and Approximating the Mode Mark Greve Progress Report Department of Computer Science Aarhus University Denmark January 4, 2010 Supervisor: Gerth Stølting Brodal Online Sorted

More information

arxiv: v2 [cs.db] 20 Jan 2014

arxiv: v2 [cs.db] 20 Jan 2014 Probabilistic Nearest Neighbor Queries on Uncertain Moving Object Trajectories Submitted for Peer Review 1.5.213 Please get the significantly revised camera-ready version under http://www.vldb.org/pvldb/vol7/p25-niedermayer.pdf

More information

SINGLE-SEQUENCE PROTEIN SECONDARY STRUCTURE PREDICTION BY NEAREST-NEIGHBOR CLASSIFICATION OF PROTEIN WORDS

SINGLE-SEQUENCE PROTEIN SECONDARY STRUCTURE PREDICTION BY NEAREST-NEIGHBOR CLASSIFICATION OF PROTEIN WORDS SINGLE-SEQUENCE PROTEIN SECONDARY STRUCTURE PREDICTION BY NEAREST-NEIGHBOR CLASSIFICATION OF PROTEIN WORDS Item type Authors Publisher Rights text; Electronic Thesis PORFIRIO, DAVID JONATHAN The University

More information

Voronoi Diagrams with a Transportation Network on the Euclidean Plane

Voronoi Diagrams with a Transportation Network on the Euclidean Plane Voronoi Diagrams with a Transportation Network on the Euclidean Plane Sang Won Bae Kyung-Yong Chwa Division of Computer Science, Department of EECS, Korea Advanced Institute of Science and Technology,

More information

A Transformation-based Framework for KNN Set Similarity Search

A Transformation-based Framework for KNN Set Similarity Search SUBMITTED TO IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 1 A Transformation-based Framewor for KNN Set Similarity Search Yong Zhang Member, IEEE, Jiacheng Wu, Jin Wang, Chunxiao Xing Member, IEEE

More information

Nearest Neighbor Search with Keywords: Compression

Nearest Neighbor Search with Keywords: Compression Nearest Neighbor Search with Keywords: Compression Yufei Tao KAIST June 3, 2013 In this lecture, we will continue our discussion on: Problem (Nearest Neighbor Search with Keywords) Let P be a set of points

More information

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing Hector Garcia-Molina Zoltan Gyongyi CS 347 Notes 03 1 Query Processing! Decomposition! Localization! Optimization CS 347

More information

k-points-of-interest Low-Complexity Privacy-Preserving k-pois Search Scheme by Dividing and Aggregating POI-Table

k-points-of-interest Low-Complexity Privacy-Preserving k-pois Search Scheme by Dividing and Aggregating POI-Table Computer Security Symposium 2014 22-24 October 2014 k-points-of-interest 223-8522 3-14-1 utsunomiya@sasase.ics.keio.ac.jp POIs Points of Interest Lien POI POI POI POI Low-Complexity Privacy-Preserving

More information

SYSTEMS for continuous monitoring or tracking of mobile

SYSTEMS for continuous monitoring or tracking of mobile 1112 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 16, NO. 9, SEPTEMBER 2004 Querying Imprecise Data in Moving Object Environments Reynold Cheng, Student Member, IEEE, Dmitri V. Kalashnikov,

More information

Informed Search. Day 3 of Search. Chap. 4, Russel & Norvig. Material in part from

Informed Search. Day 3 of Search. Chap. 4, Russel & Norvig. Material in part from Informed Search Day 3 of Search Chap. 4, Russel & Norvig Material in part from http://www.cs.cmu.edu/~awm/tutorials Uninformed Search Complexity N = Total number of states B = Average number of successors

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering

More information

Faster Algorithms for some Optimization Problems on Collinear Points

Faster Algorithms for some Optimization Problems on Collinear Points Faster Algorithms for some Optimization Problems on Collinear Points Ahmad Biniaz Prosenjit Bose Paz Carmi Anil Maheshwari J. Ian Munro Michiel Smid June 29, 2018 Abstract We propose faster algorithms

More information

An Optimized Interestingness Hotspot Discovery Framework for Large Gridded Spatio-temporal Datasets

An Optimized Interestingness Hotspot Discovery Framework for Large Gridded Spatio-temporal Datasets IEEE Big Data 2015 Big Data in Geosciences Workshop An Optimized Interestingness Hotspot Discovery Framework for Large Gridded Spatio-temporal Datasets Fatih Akdag and Christoph F. Eick Department of Computer

More information

Learning from Examples

Learning from Examples Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble

More information

CS360 Homework 12 Solution

CS360 Homework 12 Solution CS360 Homework 12 Solution Constraint Satisfaction 1) Consider the following constraint satisfaction problem with variables x, y and z, each with domain {1, 2, 3}, and constraints C 1 and C 2, defined

More information

Approximate Voronoi Diagrams

Approximate Voronoi Diagrams CS468, Mon. Oct. 30 th, 2006 Approximate Voronoi Diagrams Presentation by Maks Ovsjanikov S. Har-Peled s notes, Chapters 6 and 7 1-1 Outline Preliminaries Problem Statement ANN using PLEB } Bounds and

More information

Probabilistic Range Queries for Uncertain Trajectories on Road Networks

Probabilistic Range Queries for Uncertain Trajectories on Road Networks Probabilistic Range Queries for Uncertain Trajectories on Road Networks Kai Zheng 1 Goce Trajcevski 2 Xiaofang Zhou 1,3 Peter Scheuermann 2 1 School of ITEE, The University of Queensland, Australia 2 Department

More information

Decision Tree Learning

Decision Tree Learning Decision Tree Learning Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Machine Learning, Chapter 3 2. Data Mining: Concepts, Models,

More information

Alpha-Beta Pruning: Algorithm and Analysis

Alpha-Beta Pruning: Algorithm and Analysis Alpha-Beta Pruning: Algorithm and Analysis Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Introduction Alpha-beta pruning is the standard searching procedure used for solving

More information

Uncertain Distance-Based Range Queries over Uncertain Moving Objects

Uncertain Distance-Based Range Queries over Uncertain Moving Objects Chen YF, Qin XL, Liu L. Uncertain distance-based range queries over uncertain moving objects. JOURNAL OF COM- PUTER SCIENCE AND TECHNOLOGY 25(5): 982 998 Sept. 2010. DOI 10.1007/s11390-010-1078-3 Uncertain

More information

Informed Search. Chap. 4. Breadth First. O(Min(N,B L )) BFS. Search. have same cost BIBFS. Bi- Direction. O(Min(N,2B L/2 )) BFS. have same cost UCS

Informed Search. Chap. 4. Breadth First. O(Min(N,B L )) BFS. Search. have same cost BIBFS. Bi- Direction. O(Min(N,2B L/2 )) BFS. have same cost UCS Informed Search Chap. 4 Material in part from http://www.cs.cmu.edu/~awm/tutorials Uninformed Search Complexity N = Total number of states B = Average number of successors (branching factor) L = Length

More information

Constraint satisfaction search. Combinatorial optimization search.

Constraint satisfaction search. Combinatorial optimization search. CS 1571 Introduction to AI Lecture 8 Constraint satisfaction search. Combinatorial optimization search. Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Constraint satisfaction problem (CSP) Objective:

More information

The Truncated Tornado in TMBB: A Spatiotemporal Uncertainty Model for Moving Objects

The Truncated Tornado in TMBB: A Spatiotemporal Uncertainty Model for Moving Objects The : A Spatiotemporal Uncertainty Model for Moving Objects Shayma Alkobaisi 1, Petr Vojtěchovský 2, Wan D. Bae 3, Seon Ho Kim 4, Scott T. Leutenegger 4 1 College of Information Technology, UAE University,

More information

An Interactive Framework for Spatial Joins: A Statistical Approach to Data Analysis in GIS

An Interactive Framework for Spatial Joins: A Statistical Approach to Data Analysis in GIS An Interactive Framework for Spatial Joins: A Statistical Approach to Data Analysis in GIS Shayma Alkobaisi Faculty of Information Technology United Arab Emirates University shayma.alkobaisi@uaeu.ac.ae

More information

Finding optimal configurations ( combinatorial optimization)

Finding optimal configurations ( combinatorial optimization) CS 1571 Introduction to AI Lecture 10 Finding optimal configurations ( combinatorial optimization) Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Constraint satisfaction problem (CSP) Constraint

More information

HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence

HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence Eamonn Keogh *Jessica Lin Ada Fu University of California, Riverside George Mason Univ Chinese Univ of Hong Kong The 5 th IEEE International

More information

CS246 Final Exam. March 16, :30AM - 11:30AM

CS246 Final Exam. March 16, :30AM - 11:30AM CS246 Final Exam March 16, 2016 8:30AM - 11:30AM Name : SUID : I acknowledge and accept the Stanford Honor Code. I have neither given nor received unpermitted help on this examination. (signed) Directions

More information

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1)

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1) Correlated subqueries Query Optimization CPS Advanced Database Systems SELECT CID FROM Course Executing correlated subquery is expensive The subquery is evaluated once for every CPS course Decorrelate!

More information

Nearest Neighbor Searching Under Uncertainty. Wuzhou Zhang Supervised by Pankaj K. Agarwal Department of Computer Science Duke University

Nearest Neighbor Searching Under Uncertainty. Wuzhou Zhang Supervised by Pankaj K. Agarwal Department of Computer Science Duke University Nearest Neighbor Searching Under Uncertainty Wuzhou Zhang Supervised by Pankaj K. Agarwal Department of Computer Science Duke University Nearest Neighbor Searching (NNS) S: a set of n points in R. q: any

More information

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan Clustering CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Supervised vs Unsupervised Learning Supervised learning Given x ", y " "%& ', learn a function f: X Y Categorical output classification

More information