Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Similar documents
Multimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data

5 Spatial Access Methods

5 Spatial Access Methods

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. SAMs - Detailed outline. Spatial Access Methods - problem

AN EFFICIENT INDEXING STRUCTURE FOR TRAJECTORIES IN GEOGRAPHICAL INFORMATION SYSTEMS SOWJANYA SUNKARA

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

Nearest Neighbor Search with Keywords

Efficient Spatial Data Structure for Multiversion Management of Engineering Drawings

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

CS 273 Prof. Serafim Batzoglou Prof. Jean-Claude Latombe Spring Lecture 12 : Energy maintenance (1) Lecturer: Prof. J.C.

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Skylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland

LAYERED R-TREE: AN INDEX STRUCTURE FOR THREE DIMENSIONAL POINTS AND LINES

Multimedia Databases. Previous Lecture. 4.1 Multiresolution Analysis. 4 Shape-based Features. 4.1 Multiresolution Analysis

Making Nearest Neighbors Easier. Restrictions on Input Algorithms for Nearest Neighbor Search: Lecture 4. Outline. Chapter XI

Tailored Bregman Ball Trees for Effective Nearest Neighbors

Nearest Neighbor Search with Keywords in Spatial Databases

Data Warehousing & Data Mining

Lecture 2: A Las Vegas Algorithm for finding the closest pair of points in the plane

SEARCHING THROUGH SPATIAL RELATIONSHIPS USING THE 2DR-TREE

Multimedia Databases. 4 Shape-based Features. 4.1 Multiresolution Analysis. 4.1 Multiresolution Analysis. 4.1 Multiresolution Analysis

Transforming Hierarchical Trees on Metric Spaces

Approximate Voronoi Diagrams

Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig

Data Structures and Algorithms " Search Trees!!

Location-Based R-Tree and Grid-Index for Nearest Neighbor Query Processing

CS Data Structures and Algorithm Analysis

Binary Search Trees. Motivation

SPATIAL INDEXING. Vaibhav Bajpai

Multimedia Databases. Previous Lecture Video Abstraction Video Abstraction Example 6/20/2013

Chapter 6: Classification

12 Review and Outlook

HW #4. (mostly by) Salim Sarımurat. 1) Insert 6 2) Insert 8 3) Insert 30. 4) Insert S a.

Decision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014

Spatial Database. Ahmad Alhilal, Dimitris Tsaras

Lecture 10 September 27, 2016

Content-based Publish/Subscribe using Distributed R-trees

CS 6375 Machine Learning

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

On multi-scale display of geometric objects

Search and Lookahead. Bernhard Nebel, Julien Hué, and Stefan Wölfl. June 4/6, 2012

Compressed Index for Dynamic Text

Course 212: Academic Year Section 1: Metric Spaces

Each internal node v with d(v) children stores d 1 keys. k i 1 < key in i-th sub-tree k i, where we use k 0 = and k d =.

Lecture 4: Divide and Conquer: van Emde Boas Trees

Space-Time Tradeoffs for Approximate Spherical Range Counting

Collision. Kuan-Yu Chen ( 陳冠宇 ) TR-212, NTUST

Mid Term-1 : Practice problems

Holdout and Cross-Validation Methods Overfitting Avoidance

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

AI Programming CS S-06 Heuristic Search

Binary Search Trees. Lecture 29 Section Robb T. Koether. Hampden-Sydney College. Fri, Apr 8, 2016

Data Warehousing & Data Mining

Multimedia Databases. Wolf-Tilo Balke Younès Ghammad Institut für Informationssysteme Technische Universität Braunschweig

AVL Trees. Manolis Koubarakis. Data Structures and Programming Techniques

An efficient 3D R-tree spatial index method for virtual geographic environments

Probabilistic Model Checking Michaelmas Term Dr. Dave Parker. Department of Computer Science University of Oxford

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition

Data Warehousing & Data Mining

Succinct Data Structures for Approximating Convex Functions with Applications

Pavel Zezula, Giuseppe Amato,

Multimedia Databases Video Abstraction Video Abstraction Example Example 1/8/ Video Retrieval Shot Detection

Finite Metric Spaces & Their Embeddings: Introduction and Basic Tools

Algorithms for Nearest Neighbors

Dictionary: an abstract data type

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

Database Theory VU , SS Complexity of Query Evaluation. Reinhard Pichler

Exact and Approximate Flexible Aggregate Similarity Search

Data Warehousing. Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Cost models for distance joins queries using R-trees

Composite Quantization for Approximate Nearest Neighbor Search

Execution time analysis of a top-down R-tree construction algorithm

CS145: INTRODUCTION TO DATA MINING

arxiv: v2 [cs.ds] 3 Oct 2017

Erik G. Hoel. Hanan Samet. Center for Automation Research. University of Maryland. November 7, Abstract

An Algorithmist s Toolkit Nov. 10, Lecture 17

Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Distribution-specific analysis of nearest neighbor search and classification

Decision Procedures An Algorithmic Point of View

RESEARCH ON THE DISTRIBUTED PARALLEL SPATIAL INDEXING SCHEMA BASED ON R-TREE

More Dynamic Programming

Databases. DBMS Architecture: Hashing Techniques (RDBMS) and Inverted Indexes (IR)

More Dynamic Programming

arxiv: v1 [cs.cg] 5 Sep 2018

The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization

Evolutionary Tree Analysis. Overview

Evaluation of Probabilistic Queries over Imprecise Data in Constantly-Evolving Environments

Nearest Neighbor Search with Keywords: Compression

Indexing Objects Moving on Fixed Networks

A Single-Exponential Fixed-Parameter Algorithm for Distance-Hereditary Vertex Deletion

Modern Information Retrieval

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Decision Trees. Tobias Scheffer

Data Warehousing & Data Mining

CS 347 Parallel and Distributed Data Processing

IE418 Integer Programming

Similarity searching, or how to find your neighbors efficiently

Final Exam, Machine Learning, Spring 2009

Alpha-Beta Pruning: Algorithm and Analysis

Transcription:

Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

13 Indexes for Multimedia Data 13 Indexes for Multimedia Data 13.1 R-Trees 13.2 M-Trees Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 2

13.0 Indexes for Multimedia Data Multimedia databases Images Audio data Video data Description of multimedia objects Usually (multidimensional) real-valued feature vectors But also: skeletons, chain codes,... The sequential search for similar objects in databases is very inefficient How can we speed up the search? Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 3

13.0 Indexes for Multimedia Data Speed up search through indexing Efficient management of multidimensional information Pre-structuring of data for the subsequent search functionality Efficient data structures, combined with search and comparison algorithms Transition from set semantics to list semantics To which degree does the object from the database satisfy the query? Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 4

13.0 Indexes for Multimedia Data Requirements for a multidimensional index structure Correctness and completeness of the corresponding indexing algorithms Scalability with dimension growth Support objects which are not real-valued vectors Search efficiency (sublinear) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 5

13.0 Indexes for Multimedia Data Different types of queries: Exact search Area search Nearest neighbors search... Efficient update operations Support for various distance functions Low memory requirements Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 6

13.0 Indexes for Multimedia Data Fundamental problem: The more dimensions, the more comparisons are needed There is currently no truly scalable indexing Cause: Curse of Dimensionality (Richard Bellman) The volume of space grows exponentially with the number of its dimensions Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 7

13.0 Query Types Exact search Point search Area search k-nearest-neighbor search (k - NN-search) Find the k objects that have the least distance to the object given as reference in the request k-nn search is usually only calculated on approximation basis (with a specified error) due to the high cost Reverse-nearest-neighbor search Find all the objects whose nearest neighbor is provided in the query Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 8

13.0 Tree Structures Search in database systems B-tree structures allow exact search with logarithmic costs 2 6 7 1 3 4 5 8 9 1 2 3 4 5 6 7 8 9 Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 9

13.0 Tree Structures Search in multimedia databases The data is multidimensional, B-trees however, support only one-dimensional search Are there any possibilities to extend tree functionality for multidimensional data? Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 10

13.0 Tree Structures The basic idea of multidimensional trees Describe the sets of points through geometric regions, which comprise the points (clusters) The clusters are considered for the actual search and not the individual points Clusters can contain each other, resulting in a hierarchical structure Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 11

13.0 Tree Structures Differentiating criterias for tree structures: Cluster construction: Completely fragmenting the space or Grouping data locally Cluster overlap: Overlapping or Disjoint Balance: Balanced or Unbalanced Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 12

13.0 Tree Structures Object storage: Objects in leaves and nodes, or Objects only in the leaves Geometry: Hyper-spheres, Hyper-cube,... Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 13

13.1 R-Trees The R-tree (Guttman, 1984) is the prototype of a multi-dimensional extension of the classical B-trees Frequently used for low-dimensional applications (used to about 10 dimensions), such as geographic information systems More scalable versions: R + -Trees, R*-Trees and X- Trees (each up to 20 dimensions for uniform distributed data) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 14

13.1 R-Tree Structure Dynamic Index Structure (insert, update and delete are possible) Data structure Data pages are leaf nodes and store clustered point data and data objects Directory pages are the internal nodes and store directory entries Multidimensional data are structured with the help of Minimum Bounding Rectangles (MBRs) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 15

13.1 R-Tree Example R1 R4 R5 R6 R3 R10 R11 root R1 R2 R3 R2 R7 X Q R4 R5 R6 R7 R8 R9 R10 R11 R8 X O X p R9 root Q P O Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 16

13.1 R-Tree Characteristics Local grouping for clustering Overlapping clusters (the more the clusters overlap the more inefficient is the index) Height balanced tree structure (therefore all the children of a node in the tree have about the same number of successors) Objects are stored, only in the leaves Internal nodes are used for navigation MBRs are used as a geometry Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 17

13.1 R-Tree Properties The root has at least two children Each internal node has between m and M children M and m M / 2 are pre-defined parameters For each entry (I, child-pointer) in an internal node, I is the smallest rectangle that contains the rectangles of the child nodes Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 18

13.1 R-Tree Properties For each index entry (I, tuple-id) in a leaf, I is the smallest bounding rectangle that contains the data object (with the ID tuple-id) All the leaves in the tree are on the same level All leaves have between m and M index records Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 19

13.1 Operations of R-Trees The essential operations for the use and management of an R-tree are Search Insert Updates Delete Splitting Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 20

13.1 Searching in R-Trees The tree is searched recursively from the root to the leaves One path is selected If the requested record has not been found in that sub-tree, the next path is traversed The path selection is arbitrary Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 21

13.1 Searching in R-Trees No guarantee for good performance In the worst case, all paths must traversed (due to overlaps of the MBRs) Search algorithms try to exclude as many irrelevant regions as possible ( pruning ) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 22

13.1 Search Algorithm All the index entries which intersect with the search rectangle S are traversed The search in internal nodes Check each object for intersection with S For all intersecting entries continue the search in their children The search in leaf nodes Check all the entries to determine whether they intersect S Take all the correct objects in the result set Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 23

13.1 Example R1 R5 R3 R11 root R4 R6 R2 R7 R10 X R1 R2 R3 X R7 R8 R9 R8 S R9 root Check all the objects in node R8 Check only 7 nodes instead of 12 Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 24

13.1 Insert Procedure The best leaf page is chosen (ChooseLeaf) considering the spatial criteria Beast leaf: the leaf that needs the smallest volume growth to include the new object The object will be inserted there if there is enough room (number of objects in the node < M) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 25

13.1 Insert If there is no more place left in the node, it is considered a case for overflow and the node is divided (SplitNode) Goal of the split is to result in minimal overlap and as small dead space as possible Interval of the parent node must be adapted to the new object (AdjustTree) If the root is reached by division, then create a new root whose children are the two split nodes of the old root Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 26

13.1 R-Tree Insert Example R1 R5 R2 R3 R11 R2 R7 R2 R7 R4 R6 R10 R8 x P R9 R8 x P R9 R2 R8 R7 x P R9 root Inserting P either in R7 or R9 In R7, it needs more space, but does not overlap Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 27

13.1 Heuristics An object is always inserted in the nodes, to which it produces the smallest increase in volume If it falls in the interior of a MBR no enlargement is need If there are several possible nodes, then select the one with the smallest volume Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 28

13.1 Insert with Overflow R1 R5 R3 R11 R2 R7 R7b R4 R6 R10 R8 R9 X P R2 R7 X P root R8 R9 root R1 R2 R3 R4 R5 R6 R7 R7b R8 R9 R10 R11 Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 29

13.1 SplitNode If an object is inserted in a full node, then the M+1 objects will be divided among two new nodes The goal in splitting is that it should rarely be needed to traverse both resulting nodes on subsequent searches Therefore use small MBRs. This leads to minimal overlapping with other MBRs Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 30

13.1 Split Example Calculate the minimum total area of two rectangles, and minimize the dead space Bad split Better Split Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 31

13.1 Overflow Problem Deciding on how exactly to perform the splits is not trivial All objects of the old MBR can be divided in different ways on two new MBRs The volume of both resulting MBRs should remain as small as possible The naive approach of checking checks all splits and calculate the resulting volumes is not possible Two approaches With quadratic cost With linear cost Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 32

13.1 Overflow Problem Procedure with quadratic cost Compute for each 2 objects the necessary MBR and choose the pair with the largest MBR Since these two objects should not occur in an MBR, they will be used as starting points for two new MBRs Compute for all other objects, the difference of the necessary volume increase with respect to both MBRs Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 33

13.1 Overflow Problem Insert the object with the smallest difference in the corresponding MBR and compute the MBR again Repeat this procedure for all unallocated objects Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 34

13.1 Overflow Problem Procedure with linear cost In each dimension: Find the rectangle with the highest minimum coordinates, and the rectangle with the smallest maximum coordinates Determine the distance between these two coordinates, and normalize it on the size of all the rectangles in this dimension Determine the two starting points of the new MBRs as the two objects with the highest normalized distance Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 35

13.1 Example 5 A D B E 8 13 C 14 x-direction: select A and E, as d x = diff x /max x = 5 / 14 y-direction: select C and D, as d y = diff y /max y = 8 / 13 Since d x < d y, C and D are chosen for the split Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 36

13.1 Overflow Problem Classify all remaining objects the MBR with the smallest volume growth The linear process is a simplification of the quadratic method It is usually sufficient providing similar quality of the split (minimal overlap of the resulting MBRs) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 37

13.1 Delete Procedure Search the leaf node with the object to delete (FindLeaf) Delete the object (deleterecord) The tree is condensed (CondenseTree) if the resulting node has < m objects When condensing, a node is completely erased and the objects of the node which should have remained are reinserted If the root remains with just one child, the child will become the new root Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 38

13.1 Example An object from R9 is deleted (1 object remains in R9, but m = 2) Due to few objects R9 is deleted, and R2 is reduced (condensetree) R2 R7 R2 R7 root R8 R9 R8 R1 R2 R3 R4 R5 R6 R7 R8 R10 R11 Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 39

13.1 Update If a record is updated, its surrounding rectangle can change The index entry must then be deleted updated and then re-inserted Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 40

13.1 Block Access Cost The most efficient search in R-trees is performed when the overlap and the dead space are minimal root A D C N F M E E L G S H K J I B A B C D E F G H I J K L M N Avoiding overlapping is only possible if data points are known in advance Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 41

13.1 Improved Versions of R-Trees Where are R-trees inefficient? They allow overlapping between neighboring MBRs R + -Trees (Sellis ua, 1987) Overlapping of neighboring MBRs are prohibited This may lead to identical leafs occurring more than once in the tree Improve search efficiency, but similar scalability as R-trees Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 42

13.1 R + -Trees A D F E P G S H K J I B root A B C P C M N L D E F G I J K L M N G H Overlaps are not permitted (A and P) Data rectangles are divided and may be present (e.g., G) in several leafs Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 43

13.1 Operations in R + -Trees Differences to the R-tree Insert Data object can be inserted into several leafs Splitting continues downwards, since no overlaps are allowed Delete There is no more minimum number of children Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 44

13.1 Performance The main advantage of R + -trees is to improve the search performance Especially for point queries, this saves 50% of access time Drawback is the low occupancy of nodes resulting through many splits R + -trees often degenerate with the increasing number of changes Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 45

13.1 More Versions R*- trees and X-trees improve the performance of the R + -trees (Kriegel and others, 1990/1996) Improved split algorithm in R*-trees Extended nodes in X-trees allow sequential search of larger objects Scalable up to 20 dimensions Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 46

13.2 M-Trees M-tree (Ciaccia et al, 1997) allows the use of arbitrary metrics for comparison of objects ( metric trees ) R-trees only work with Euclidean metrics, but what about for example, the editing distance? Use the triangle inequality to check sub-trees Geometry is determined by the distance function Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 47

13.2 Metric Space A metric space is a pair of M = (U, d) U is the universe of all possible values d is a metric For all x, y, z U: d(x, y) 0, d(x, y) = 0 iff. x = y d(x, y) = d(y, x) d(x, y) d(x, z) + d(z, y) (triangle inequality) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 48

13.2 Triangle Inequality Precomputed: Distances for all pairs of points Task: Find the object with the smallest distance to Q Distance between Q and a is 2 Distance between Q and b is 7.81 Can C be the best object? d (Q, b) d (Q, c) + d ( b, c) 5,51 d (Q, c) No. Therefore a is better Q a b a b c a 6.70 7.07 b 2.30 c c Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 49

13.2 Partitioning The M-tree partitions the objects in ε-environments with certain radius A balanced partitioning is obtained by choosing P1 = { p d(p, v) r v } and P2 = { p d(p, v) > r v } so that P1 P2 P / 2 For a query q with d(q, x) < r only P2 must be considered Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 50

13.2 M-Trees M-trees are similar to R-trees, but use the distance information Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 51

13.2 M-Trees Each node N has a region Reg(N) Reg Reg(N) = {p p U, d(p, v N ) r N } With v N as so called routing object and r N as the radius of the area ( covering radius ) All the indexed points p have guaranteed distance of at most r N from v N Queries qwith d(q, v N ) > r N + r don t need to consider node N Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 52

13.2 M-Trees Internal nodes have A routing object The radius of their region and A distance to the parent node Leaf nodes have The values of the indexed objects and Their distance from the parent node Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 53

13.2 M-Trees Precomputed distances to the respective parent nodes allow fast searching ( fast pruning ) d(v P, v N ) is precomputed. We don t need d(q, v N ) if d(q, v P ) d(v P, v N ) > r N + r P P N N Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 54

13.2 M-Trees Insert is performed as by R-trees with the smallest expansion of the region radius At overflow, a split is performed No volumes are however calculated (as in MBRs in the R-tree) Delete the node and choose two new routing objects Heuristic: Minimize the maximum of the two resulting region radiuses Attribute then the routing objects to the new regions alternating between their nearest neighbors (Balanced Split) Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 55

13.2 M-Trees M-Trees overview Allow a variety of distance functions Use triangle inequality for pruning The dimensionality is also very limited Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 56

Next lecture Indexes for Multimedia Data Curse of Dimensionality Dimension Reduction GEMINI Indexing Multimedia Databases Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 57