Theoretically Optimal and Empirically Efficient R-trees with Strong Parallelizability

Size: px
Start display at page:

Download "Theoretically Optimal and Empirically Efficient R-trees with Strong Parallelizability"

Transcription

1 Theoretically Otimal and Emirically Efficient R-trees with Strong Parallelizability Jianzhong Qi, Yufei Tao, Yanchuan Chang, Rui Zhang School of Comuting and Information Systems, The University of Melbourne Deartment of Comuter Science and Engineering, Chinese University of Hong Kong ABSTRACT The massive amount of data and large variety of data distributions in the big data era call for access methods that are efficient in both query rocessing and index bulk-loading, and over both ractical and worst-case workloads. To address this need, we revisit a classic multidimensional access method the R-tree. We roose a novel R-tree acking strategy that roduces R-trees with an asymtotically otimal I/O comlexity for window queries in the worst case. Our exeriments show that the R-trees roduced by the roosed strategy are highly efficient on real and synthetic data of different distributions. The roosed strategy is also simle to arallelize, since it relies only on sorting. We roose a arallel algorithm for R-tree bulk-loading based on the roosed acking strategy, and analyze its erformance under the massively arallel communication model. Exerimental results confirm the efficiency and scalability of the arallel algorithm over large data sets. PVLDB Reference Format: Jianzhong Qi, Yufei Tao, Yanchuan Chang, and Rui Zhang. Theoretically Otimal and Emirically Efficient R-trees with Strong Parallelizability. PVLDB, (): -, 0. DOI: htts://doi.org/0./.. INTRODUCTION Satial databases have been traditionally used in geograhic information systems, comuter-aided-design, multimedia data management, and medical studies. They are becoming ubiquitous with the roliferation of location-based services such as digital maing, augmented reality gaming, geosocial networking, and targeted advertising. For examle, in maing services such as Google Mas, the search this area functionality suorts querying laces of interest (POIs) such as shos within a given view area (cf. Fig. a). In a oular augmented reality game, Pokémon GO [], every layer has an avatar laced in the game ma based on the layer s geograhical location. The layers can interact Permission to make digital or hard coies of all or art of this work for ersonal or classroom use is granted without fee rovided that coies are not made or distributed for rofit or commercial advantage and that coies bear this notice and the full citation on the first age. To coy otherwise, to reublish, to ost on servers or to redistribute to lists, requires rior secific ermission and/or a fee. Articles from this volume were invited to resent their results at The th International Conference on Very Large Data Bases, August 0, Rio de Janeiro, Brazil. Proceedings of the VLDB Endowment, Vol., No. Coyright 0 VLDB Endowment 0-09//0... $ DOI: htts://doi.org/0./. (a) Shos (blue and red dots) in a ma view (b) Gaming objects in a game view Figure : Window queries in real alications with gaming objects (e.g., okémons ) in the game view through their avatars (cf. Fig. b). Managing POIs or gaming objects in a view which usually has a rectangular window shae is a tyical alication of satial databases. In these alications, there may be hundreds of millions of satial objects (e.g., shos, restaurants, okémons, etc.) with a variety of distributions to be managed. Meanwhile, there may be millions of service requests from users, e.g., Google Mas is answering millions of queries er day [], and Pokémon GO is attracting over 0 million active users []. Reorting POIs or okémons in a given area in real time under such settings oses significant challenges. Satial indices are imortant techniques to address such challenges. They offer fast retrieval of satial objects. We revisit a classic satial index the R-tree [9]. Our aim is to achieve an R-tree structure that is efficient in both window query rocessing and tree bulk-loading, and over both ractical and worst-case workloads. R-trees have attracted extensive research interests [,,, 0,, ] and industrial alications [, ]. An R-tree is a balanced

2 tree structure designed for external memory based satial object indexing. Every node in an R-tree may contain multile entries. In the leaf nodes, the entries are minimum bounding rectangles (MBR) of the data objects (and ointers to them); in the inner nodes, the entries are MBRs of and ointers to the child nodes. An R-tree node usually corresonds to a disk block, the size of which constrains the caacity of the node, i.e., the maximum number of entries er node, denoted as B. Given an R-tree, a window query returns all the data objects (e.g., POIs or okémons) indexed in the tree that are within or intersect a given query window which is usually a rectangular region of interest. R-trees have good query efficiency in ractice when they are constructed with carefully crafted heuristics [,,, ]. However, all these heuristics suffer from the limitation that they do not roduce an R-tree with attractive erformance guarantees in the worst case. The Priority R-tree (PR-tree) [] is an R-tree structure that has a theoretical query erformance guarantee. It answers a window query with O((n/B) /d + k/b) I/Os in the worst case, which is known to be asymtotically otimal []. Here, n, d, and k denote the data set size, the dimensionality, and the outut size, i.e., the number of objects satisfying the query, resectively. The PR-tree is designed for rectangles. As a followu study shows [0], the PR-tree may not have satisfactory emirical erformance on data objects of a small size (e.g., oint data) or queries with small query windows; it also suffers in bulk-loading erformance; and the tree construction is difficult to arallelize. We re-examine the construction of R-trees and aim for high window query efficiency over oint data, which is a common way for reresenting locations on digital mas. Satial objects with extents can also be efficiently transformed into oints for query rocessing, as a recent study [] shows. We target alication scenarios such as digital maing, where the data sets are relatively static and can be batchudated. We construct R-trees that are query cost otimal for static data. Our R-trees handle dynamic data udates efficiently as well, although the otimality may be comromised. We also offer efficient techniques for eriodical R-tree rebuilding to reserve the query cost otimality. We roose an R-tree acking strategy that creates R- trees with the worst-case otimal window query I/O cost O((n/B) /d + k/b). This strategy has a simle rocedure and the R-trees roduced have high ractical query efficiency. A key ste we take before acking the data oints is to ma them into a rank sace such that their coordinates are maed to their ranks in each dimension. Ties in one dimension are broken by the coordinates in the other dimension. As a result, we obtain data oints with no reetitive coordinates in either dimension. We then simly ack every B data oints into a leaf node (excet ossibly the last leaf node) of an R-tree in the ascending order of the Z-order values of the data oints in the rank sace. The Z-order is an ordering created by the Z-curve [] which is a common tye of sace-filling curves (SFC). The inner nodes of the R-tree are created by acking every B child nodes into a arent node (excet ossibly the last node in each level) again in the ascending order of the Z-cure values and recursively from the bottom to the to of the tree. An inner node entry stores a ointer to a child node and its MBR. Since our R-tree acking strategy relies only on sorting, it takes O((n/B) log M/B (n/b)) I/Os to bulk-load an R-tree, where M is the size of the memory. A key advantage of this strategy is that it is highly arallelizable, which is an imortant feature in the big data era. Bulk-loading an R-tree with this strategy well suits the oular massively arallel communication (MPC) model [], which aves the foundation for designing algorithms for MaReduce systems []. We roose a arallel bulk-loading algorithm that takes O(log s n) rounds of comutation, where s = n/g and g is the number of machines articiating in the arallel algorithm. The (arallel) running time of our algorithm is O((n log n)/g), while the total time (summed over all machines) is O(n log n). For modern machines, s is large, e.g., at the order of millions, allowing the roosed algorithm to bulk-load an R-tree with a very large number of data oints in just a few rounds of comutation. While the rank sace has been used by the comutational geometry community to develo theoretical bounds [, ], we observe for the first time that rank-sace conversion can be leveraged to build a worst-case otimal structure for window queries. Furthermore, it is erhas surrising that we are able to achieve the urose by combining the rank sace with an SFC, because SFC-based indexes were reviously thought to have oor worst-case query costs. Indeed, as shown in [], if an SFC is used directly (i.e., in the original data sace) for indexing, there exist window queries which retrieve few oints, but have I/O costs linear to the data set size. In fact, even analyzing the query cost of an SFC-based index is non-trivial. The limited literature on this toic [, 9, ] has focused on the average query cost, which is analyzed indirectly by studying the clustering behavior of SFCs. In summary, this aer makes the following contributions:. We roose the first SFC-based acking strategy that creates R-trees with a worst-case otimal window query I/O cost.. The roosed acking strategy suggests a simle R- tree bulk-loading algorithm that relies only on sorting. We roose such an algorithm under the massively arallel communication model (and thus, works on MaReduce systems) with attractive erformance guarantees.. We erform extensive exeriments on both real and synthetic data. The results confirm the sueriority of the roosed R-tree acking strategy: on real data, the query I/O cost of the R-trees that we construct is u to % lower than that of PR-trees [] and similar to that of STR-trees [], which are a classic tye of sorting based bulk-loaded R-trees; on highly skewed synthetic data, the query I/O cost of the R-trees that we construct is % lower than that of PR-trees and 0% lower than that of STR-trees. The roosed bulkloading algorithm also outerforms the PR-tree bulkloading algorithm in running time by % over large data sets with 0 million data oints. The rest of the aer is organized as follows: Section discusses related work. Section details the roosed R-tree acking strategy and the worst-case window query I/O costs. Section describes the roosed arallel R-tree bulk-loading algorithm. Section discusses data udate handling. Section studies the emirical erformance of the roosed algorithms. Section concludes the aer.

3 . RELATED WORK We review studies on satial queries and access methods, with a focus on R-trees. Satial queries and access methods. We focus on the window query (rectangular range query) which is a basic tye of satial queries []. A window query returns all data objects that satisfy a certain redicate with a given query window, i.e., a (hyer)rectangular region of interest. Common query redicates include containment and intersection, which require the data objects to be fully contained in or intersect the query window, resectively. A straightforward window query algorithm sequentially checks every data object and returns an object if it satisfies the query redicate. This algorithm takes O(n/B) I/Os regardless of data distribution and outut size. Satial indices have been used to obtain higher query efficiency. We focus on the R-tree index [9]. For a comrehensive review on satial indices, interested readers are referred to []. R-trees. As discussed earlier, the R-tree is a balanced tree structure. The maximum number of entries er tree node (node caacity) B is constrained by the block size, while the minimum number of entries er tree node (excet the root node) is Ω(B). The root node needs to contain at least entries unless it is also a leaf node. Thus, the height of an R-tree indexing n objects is bounded by O(log B n). A window query is rocessed by a to-down traversal over the nodes of an R-tree whose MBRs satisfy the query. When the leaf nodes are reached, data objects in them satisfying the query are returned. A series of studies [, 9,, 0, ] roose heuristics to otimize the node MBRs during dynamic data insertion. The R*-tree [], for examle, considers the MBR overlas and region erimeters to decide the node into which a new object should be inserted. R-tree acking and bulk-loading. A different stream of research considers how to construct an R-tree by acking data objects into the leaf nodes directly rather than inserting them individually. The entire R-tree is bulk-loaded in a bottom-u fashion. Most R-tree acking algorithms [, 0,,, ] rely on some ordering of the data objects and hence have an I/O cost of O((n/B) log M/B (n/b)), which is the cost for sorting n objects (recall that M is the number of objects allowed in the main memory). Secifically, Roussooulos and Leifker [] sort the data objects by their x-coordinates and ack every B objects into a leaf node. Leutenegger et al. [] first sort the data objects by their x- coordinates, and then artition the data into n/b subsets. Objects in each subset are sorted by their y-coordinates and acked into the leaf nodes. Other studies use the Hilbert ordering [, 0, ]. R-trees bulk-loaded based on Hilbert ordering have good window query erformance on nicely distributed data [0]. However, these R-trees do not have worst-case otimal query erformance. There are also to-down bulk-loading algorithms, e.g., To-down Greedy Slit (TGS) []. TGS artitions the data set into two subsets reeatedly until B aroximately equisized subsets have been obtained. The MBRs of these B subsets form entries of the root. Each artition uses a cut orthogonal to an axis that yields two subsets with the minimum sum of costs, where the cost is based on a user-defined function, e.g., the area of the MBR of a subset. There are O(B) candidate cuts, where the hidden constant lies in the different cuts in different dimensions and on different orderings (e.g., lower x corner, center, etc.). In each dimension and with a articular ordering, the i th cut uts i n/b objects in one subset and the rest in the other subset. TGS has been shown to roduce R-trees with good query efficiency, but it has a high worst-case I/O cost, O(n log B n), for R-tree construction. This is because it needs to scan the data set B times to create the B artitions of a node (assuming that the orderings used for artitioning have been recomuted). If viewed from a recursive binary artition ersective, the I/O cost of TGS is effectively O((n/B) log n) []. Agarwal et al. [] roose an algorithm to bulk-load a Box-tree, which can be converted to an R-tree that obtains a worst-case query I/O cost of O((n/B) /d +k log B n). This work is more of theoretical interest. No imlementation or exerimental results have been given for the algorithm. The PR-tree [] is an R-tree that offers a worst-case window query I/O cost of O((n/B) /d + k/b), which is asymtotically otimal []. A PR-tree is created from a seudo-pr-tree, which is an unbalanced tree built in a todown fashion. To create a seudo-pr-tree, the data set is artitioned into six artitions to form the child nodes of the root. Four of the artitions contain B objects each, which are objects with the smallest lower x-coordinates, the smallest lower y-coordinates, the largest uer x-coordinates, and the largest uer y-coordinates, resectively. The remaining two artitions are two equisized artitions of the remaining objects, which are then recursively artitioned to form subtrees of the root. When a seudo-pr-tree is created, its leaf nodes are used as the leaf nodes of a PR-tree. The MBRs of the leaf nodes are used to create another seudo-pr-tree, the leaf nodes of which are used as the arent nodes of the leaf nodes of the PR-tree. A PR-tree is then built with O((n/B) log n) I/Os bottom-u. Arge et al. [] further roose a bulk-loading strategy which lowers the I/O cost to O((n/B) log M/B (n/b)). The main issues of the PR-tree are that it lacks ractical efficiency in bulk-loading and answering queries with small query windows [0]. We also note that other satial indices such as kd-trees [], O-trees [], and cross-trees [] can offer a worst-case otimal query I/O erformance. Comared with R-trees created by our acking strategy, kd-trees are more difficult to bulkload in arallel. In the MPC model, Agarwal et al. [] roose a randomized algorithm that can bulk-load a kdtree with O(oly log s n) rounds of comutation. In contrast, we can bulk-load an R-tree with O(log s n) rounds of comutation which is lower, and our bulk-loading algorithm is deterministic. As for O-trees, they do not belong to the R- tree family. They combine multile auxiliary structures to ensure their theoretical guarantees. This aroach is mainly of theoretical interest, but in ractice is exensive in both sace consumtion and query cost (even being asymtotically otimal in the worst case). Cross-trees share a similar issue in its ractical query erformance []. These indices are not discussed further. Parallel R-tree management. Parallelism has been exloited to scale R-trees to large data sets and user grous. An early study [] considers storing an R-tree on a multidisk system. It stores a newly created tree node in the disk that contains the most dissimilar nodes to otimize the system throughut. A few studies [,, ] assume a shared nothing (client-server) architecture for distributed R-tree storing and query rocessing. Koudas et al. [] store the inner nodes on a server while the leaf nodes on clients. Schnitzer and Leutenegger [] further create local R-trees

4 Symbol P n d q k B h g s Table : Frequently Used Symbols Descrition A data set A data oint The cardinality of P The dimensionality of P A window query q The answer set size of a window query The node caacity of an R-tree The height of an R-tree The number of machines in a cluster The number of data oints allowed in a machine on clients for higher query efficiency. Mondal et al. [] study load balancing for R-trees in shared nothing systems. The studies above do not focus on arallel R-tree bulkloading. Paadooulos and Manolooulos [] roose a generic rocedure for arallel satial index bulk-loading. They use samling to estimate the data distribution which hels artition the data sace into regions. Data objects in different regions are assigned to different clients for local index building. A global index is built on the server which serves as a coordinator for query rocessing. A more recent study [0] bulk-loads an R-tree with the MaReduce framework level by level, where each level takes one MaReduce round. The study also uses an SFC for object ordering. However, its focus is on deciding the slit oints among tree nodes. The tree nodes can contain varying numbers of entries, which leads to non-otimal bulk-loading costs. Similar ideas have been used on GPUs [] without a cost analysis.. R-TREE PACKING We consider a set P of n data oints in a d-dimensional Euclidean sace. For ease of resentation, we use d = in the following discussion, although our aroach also alies to any d >. We focus on window query rocessing. Given a rectangle q, a window query reorts all the oints in P q. We list the frequently used symbols in Table.. Maing to Rank Sace y y q y e q y e 0 0 x e x e x 0 x Original sace Rank sace Figure : Maing to rank sace Before creating an index structure over P, we first ma the data oints into a -dimensional rank sace as follows. In each dimension of the original data sace, we sort the data oints by their coordinates and use the ranks as the coordinates in the corresonding dimension of the rank sace. Rank ties in a dimension are broken by the coordinates in the other dimension of the original sace. We assume no data oints with the same coordinates in both dimensions. Define by [n] the integer domain [0, n ]. After the maing, P becomes a set of n -dimensional oints in [n] such that no two oints share the same x- or y-coordinate. Figure illustrates the maing with a set of (n = ) oints P = {,,..., }. The coordinates of the oints in the rank sace are their ranks in the original sace. For examle, has the smallest x-coordinate and second largest y-coordinate in the original sace. Thus, its x-coordinate and y-coordinate in the rank sace are 0 and, resectively. Points and both have the second smallest x- coordinate in the original sace. In the rank sace, has an x-coordinate of while has an x-coordinate of, because has a smaller y-coordinate in the original sace. A query rectangle q = [x e, x e] [y e, y e] in the original sace is maed to a rectangle q = [x, x ] [y, y ] in [n]. Here, x (y ) is the smallest rank of the data oints whose x-coordinates (y-coordinates) in the original sace are greater than or equal to x e (y e); x (y ) is the largest rank of the data oints whose x-coordinates (y-coordinates) in the original sace are smaller than or equal to x e (y e). In Fig., the solid rectangle reresents a query rectangle q. In the original sace, the query range [x e, x e] sans,,, and in the x-dimension, among which ( ) has the smallest (largest) rank (). Thus, [x e, x e] is maed to [, ] in the rank sace. Similarly, the query range [y e, y e] is maed to [, ] in the rank sace. Note that the query maing does not introduce false ositives in the query answer because the data oints do not share the same coordinate in either dimension in the rank sace. Our goal is to store P in a structure so that all window queries can be answered efficiently in the worst case. Without loss of generality, we consider that n is a ower of.. Tree Structure and Packing Strategy Our structure is simly an R-tree where the leaf nodes are obtained by acking oints in ascending order of their Z-order [] values. Other sace-filling curves such as Hilbert curves can also be used (as will be discussed in Section.) but Z-order is used for illustration. For each oint P, we comute its Z-order value Z() in [n]. Suose that = (x, y), where x = α α...α l and y = β β...β l in binary form where l = log n. Then, Z() = β α β α...β l α l. We sort all the oints of P by their Z-order values, and cut the sorted list into subsequences, each of which has length B, excet ossibly the last subsequence, where B is a arameter that controls how many oints fit in a leaf node of an R-tree. Each leaf node includes the oints in a subsequence. The inner nodes of the R-tree are created by acking every B child nodes into a arent node (excet ossibly the last node in each level) recursively from the bottom to the to of the tree. This rocess resembles how a B-tree is created, excet that an inner node entry stores a ointer to a child node and its MBR instead of a key value. This creates our target R-tree. Figure illustrates the R-tree acking strategy. The rank sace can be seen as an grid. A Z-curve (the dotted olyline) is drawn across the rank sace. The order that a cell is reached by the curve is the Z-order value of the data oint in the cell, e.g., in Fig. a, is in the second cell reached by the curve; its Z-order value is, which is

5 y (0) q 0 0 () () () () (9) (a) () () x y 0 N T (0) () N q () () N () () N (9) N () N 0 x (b) N N T N N N N (c) Figure : R-tree acking labeled in arentheses next to (same for the other oints). Based on the Z-order values, the data oints are sorted as:,,,,,,,. We use B = in this examle. The eight data oints are acked into four leaf nodes: N =,, N =,, N =,, and N =,. These four leaf nodes are then acked in the order of the Z- order values of the data oints stored in them, resulting in two inner nodes N = N, N and N = N, N. A root node is further created to oint to N and N. Figures b and c show the MBRs and the R-tree T created. Algorithm : Build-R-tree Inut: P = {,,..., n}: a d-dimensional database; B: the caacity of a tree node. Outut: T : an R-tree over P. Ma P into the rank sace; for each i P in the rank sace do Comute Z-order value of i ; Sort P in ascending order of the Z-order values; Let Q ; for every B data oints in the sorted P do Create a leaf node N to store the B data oints; Q.enqueue( N, ); 9 while Q.size() > do 0 Dequeue the first B nodes of the same level t from Q; Create a node N to store MBRs (ointers) of the nodes; Q.enqueue( N, t + ); Let T oint to the last node in Q; return T ; Algorithm summarizes the the roosed R-tree acking strategy with the hel of an auxiliary queue Q. The queue stores -tules in the form of N, t where N and t are a tree node and its level in the tree, resectively. The acking strategy takes (i) two sorts on the data oints to ma them into the rank sace (Line ), (ii) a linear scan on the data oints to comute their Z-order values (Lines and ), (iii) another sort on the Z-order values (Line ), and (iv) another linear scan on the data oints (Lines to ) and log B n rounds of linear scans on the MBRs of tree nodes for acking and loading an R-tree (Lines 9 to ). Together, this acking strategy takes O((n/B) log M/B (n/b)) I/Os to bulk-load an R-tree (the CPU time is O(n log n), noticing that the Z- order value of a oint can be calculated in O(log n) time). Sorting and linear scans can be easily arallelized. This suggests a simle arallel R-tree bulk-loading algorithm where everything boils down to sorting. We resent such an algorithm in Section.. Window Query Processing When a window query is issued, we first ma it to the rank sace following the rocedure described in Section.. To facilitate fast maing, we create two B-trees to store airs of oint coordinates in the original sace and corresonding coordinate in the rank sace, one for each dimension. Query maing using B-trees takes O(log B n) I/Os. The maed query is then answered by our R-tree in the same way as for a conventional R-tree. We omit the seudo-code of the query algorithm for conciseness. As an examle, in Fig. c, we show the search aths for rocessing query q in gray... Query Cost We rove that our R-tree answers a window query with O((n/B) /d + k/b) I/Os in the worst case, where k is the number of oints reorted. This query comlexity is known to be asymtotically otimal [, 0]. Let h log B n be the height of the tree. Label the levels of the tree as,,..., h bottom u. Consider any level t [, h]. Let l be any vertical line in [n]. We rove the following lemma, which is sufficient for establishing our claim. Lemma. The line l intersects the MBRs of O( n/b t ) nodes at level t. Proof. Intuitively, the MBR of a node intersects a line l when it covers data oints on both sides of l (e.g., in Fig. a, node N contains and on both sides of the dashed line l) or data oints on l (e.g., in N ). Such a node corresonds to a Z-curve segment that crosses or ends at l. Since different nodes corresond to non-overlaing curve segments (because the data oints are acked by ascending Z-order values), we derive the number of nodes intersecting l via the number of times that the Z-curve crosses l. Let m be the smallest ower of larger than or equal to nbt. Divide [n] into an (n/m) (n/m) grid denoted as G, where each cell has m locations in [n]. Note that the Z-curve traverses all the locations in a cell before moving to another, i.e., it never comes back to the same cell. We use Fig. a to illustrate the roof for the case where t =, i.e., at the leaf node level. It shows the four leaf

6 y 0 l 0 N (a) C G y N q N N N N 0 N x 0 x (b) C Figure : Window query I/O cost G N Tye : When u is in column C, the x-coordinate of any data oint in the subtree of the node is in the range of [a, b]. There are b a + = m distinct x-coordinates in the range, imlying m data oints in the range (recall that all oints have distinct x- coordinates). Each node at level t can index B t data oints. Thus, there are O( + m/b t ) nodes of Tye. In Fig. a, b a+ = 0+ =. The four data oints in the gray column can form at most m/b t = / = nodes fully contained in the column, although there is just one such node in this examle which is N. If does not exist then and will form another node fully contained in the column. nodes N, N, N, and N (the dashed rectangles) of an R- tree constructed. We have nb t = = which means m =. The rank sace is divided into an (/) (/) = grid, which is reresented by the black solid line grid in the figure. The Z-curve enters and leaves each cell once, e.g., for the to-left cell, the Z-curve enters at its bottom-left corner and leaves from its to-right corner. Let C = [a, b] [n] be the column of G that contains line l. In Fig. a, the vertical dashed line reresents l, which is in column C = [0, ] [] highlighted in gray. Define the line x = a as the left boundary of C, and the line x = b as the right boundary. Let u be a node whose MBR intersects l. Define X(u) as the x-range of the MBR of u. For examle, N intersects the line, and X(N ) = [0, ]. Such u can be one of the following tyes: Tye : a X(u) or b X(u), i.e., u overlas column C (cf. node N ). Tye : X(u) [a, b], i.e., u is inside column C (cf. node N ). We rove that there are at most n/m n/b t nodes of Tye and O( + m/b t ) = O( n/b t ) nodes of Tye, which comletes the roof of the lemma: Tye : Note that the Z-curve crosses the left boundary (i.e., enters column C) n/m times. This is because there are n/m cells of G in each column, and the curve enumerates all the locations of a cell of G before moving to another. In Fig. a, there are n/m = / = cells in column C. The curve enters these two cells once each. Since the data oints are sorted and acked into nodes by their curve values, there are at most n/m nodes that contain data oints on both sides of the left boundary. Otherwise, some of these nodes must have overlaing curve values, which is against our acking strategy. The same alies to the right boundary. Thus, there are at most n/m nodes of Tye. In the figure, N overlas the right boundary of the to cell of column C. It contains two data oints and on the two sides of the right boundary of this cell. The curve segment between them crosses the right boundary of the cell. Since the curve only leaves the cell once, there cannot be another node N that also overlas the right boundary of the cell. Otherwise, the two curve segments corresonding to N and N must overla, which violates our acking strategy. Discussion. For an R-tree with a height h log B n, line l interests the MBRs of O( n/b) + O( n/b ) O( n/b h ) = O( n/b) nodes. The above roof assumed n being a ower of and d =. When n is not a ower of, let ρ = log n. We enlarge the rank sace to [ ρ ]. Line l intersects O( ρ /B) nodes in the enlarged rank sace. We have O( ρ /B) = O( n/b) because ρ n. The above argument can also be generalized to an arbitrary fixed dimensionality d to rove that our query cost is bounded by O((n/B) /d + k/b) in the worst case. This will be roven in Section.., which roves an even stronger result that subsumes the aforementioned bound as a secial case... Query Sensitive Bound in Arbitrary Dimensionality Consider a query rectangle q = [x, y ] [x, y ]... [x d, y d ] in [n] d, where d is an arbitrary fixed dimensionality. For each i [, d], set λ i = y i x i +, and define Z i = {,,..., d} \ {i}, namely, Z i includes all the integers from to d excet i. We will rove a stronger version of our revious lemma: our structure answers the query in O(log B n + Λ /d /B /d + k/b) () I/Os where Λ = d i= j Z i λ j. In this bound, log B n, Λ /d /B /d, and k/b denote the costs to ma the query to rank sace (query q here is already maed), to find the nodes intersecting the query boundary, and to outut the oints inside the query, resectively. The bound looks a bit unusual such that it would hel to look at some secial cases: for d =, the query cost is O(log B n + (λ + λ )/B + k/b), while for d =, the cost becomes O(log B n + (λ λ + λ λ + λ λ ) / /B / + k/b). Since λ i n for all i [, n], it always holds that j Z i λ j n d and Λ d n d. Thus, Equation () is bounded by O((d n d ) /d /B /d +k/b) = O((n/B) /d + k/b). In other words, Equation () is never worse than the (query insensitive) bound established in Section.., but could be substantially better when q is small. We say that the MBR of a node artially intersects q if it has a non-emty intersection with q, but is not contained by q. We will rove the following lemma, which is sufficient for establishing our claim. Lemma. The query rectangle q artially intersects the MBRs of O( + Λ /d /(B t ) /d ) nodes at level t.

7 Proof. Let m be the smallest ower of at least (Λ B t ) /d. Divide [n] d into an (n/m) (n/m)... (n/m) }{{} d grid G, where each cell has m d locations in [n] d (the cell s rojection on each dimension covers m values). For each i [, d], define a dimension-i column of G as the maximal set of cells in G that have the same rojection on dimension i. Grid G has n/m dimension-i columns, each of which is a d-dimensional rectangle in [n] d that covers the entire range [n] on every dimension j i. We use Fig. b to illustrate the roof, where d = and t =, i.e., at the leaf node level. We have q = [, ] [, ] (the solid line rectangle) and hence λ = + = and λ = + =. Thus, m = (Λ B t ) /d = (λ + λ )B t = ( + ) =. The rank sace is divided into an (/) (/) = grid, which is reresented by the black solid line grid. Grid G has / = dimension- (xdimension) columns, i.e., the two vertical columns. A node whose MBR artially intersects q must intersect one of the d boundary faces of q (e.g., edges of q in Fig. b). We will rove that there can be at most O( + m/b t + j Z i λ j/m d ) nodes intersecting each of the two faces of q erendicular to dimension i. Summing this u on all d dimensions gives an uer bound on the total number of nodes that artially intersect q: d i= O( + m/bt + j Z i λ j/m d ) = O(d + dm/b t + Λ/m d ) = O( + Λ /d /(B t ) /d ) as desired in the lemma. Due to symmetry, it suffices to consider the face of q that corresonds to x (i.e., erendicular to dimension ) we refer to it as the dimension- left face of q. Let C be the dimension- column of G that covers this face; C is a rectangle that can be written as [a, b] [n] [n]... [n] }{{} d some a, b satisfying b a+ = m and b is a multile of. In Fig. b, dimension is the x-dimension, and C is the gray column [0, ] []. Define the left boundary (or right boundary) of C to be the set of oints in [n] d with coordinate a (or b, resectively) on dimension. Let u be a level-t node with an MBR intersecting the dimension- left face of q, and X(u) be the rojection of the MBR of u on dimension, e.g., N intersects the left edge of q, and X(N ) = [0, ]. Such u can be of the following tyes: Tye : a X(u) or b X(u). Tye : X(u) [a, b]. Next, we analyze the number of nodes for each tye. Tye : The Z-curve crosses the left boundary of C at most O( d j= λi/m ) = O(+λλ...λ d /m d ) times within the dimension- left face of q. This is because there are O( d j= λi/m ) cells of C within the range of q in dimensions to d (e.g., in Fig. b, there are +λ /m = +/ = cells of C within the dimension range [, ] of q), and the curve enumerates all the locations of a cell before moving to another cell. By the reasoning exlained in the roof of Lemma, there are at most O( d j= λi/m ) nodes containing data oints on both sides of the left boundary. The same alies to for the right boundary. Therefore, the number of Tye- nodes is O( + λ λ...λ d /m d ). Tye : All the B t oints in the subtree of node u must have x-coordinates between a and b. There can be only b a+ = m such oints (recall that all oints have distinct coordinates in each dimension), imlying at most O( + m/b t ) such nodes. It thus follows that the dimension- left face of q intersects the MBRs of O( + m/b t + j Z i λ j/m d ) nodes at level t. This comletes the roof.. Extending to Other Sace Filling Curves Although we have used the Z-curve as the reresentative SFC, the only roerty that we require from the Z-curve is the following quad-tree recursive attern. Divide the data sace [n] d (where n is a ower of ) into d rectangles of the same size, i.e., each rectangle is a d-dimensional square with a rojection length of n/ on each dimension (recall how the root of a d-dimensional quad-tree would artition the sace). For examle, in Fig. a, grid G is divided into = squares (cells) each with an edge length of. The quad-tree recursive attern says that the SFC must first enumerate all the oints within a rectangle before starting to enumerate the oints of another. In Fig. a, the Z-curve enumerates the oints of the bottom-left cell before moving to the bottom-right cell. The attern must be followed recursively within each rectangle, by treating it as a smaller data sace [n/] d. All our roofs hold verbatim on any SFCs (e.g., the Hilbert curve) that obey this attern.. PARALLEL R-TREE BULK LOADING Next, we resent a arallel R-tree bulk-loading algorithm based on our acking strategy. A straightforward arallel algorithm that bulk-loads an R-tree level by level requires O(log B n) rounds of arallel comutation. We show how to reduce the number of rounds to O(log s n) without sacrificing the comutation time. Here, s denotes the number of data oints that a machine articiating in the arallel algorithm can handle. Modern machines can easily handle millions of data oints, where log s n is tyically bounded by a constant. The key idea of the roosed algorithm is to distribute the data oints (or MBRs of tree nodes) in a way that the machines can bulk-load O(log B s) levels of the final R-tree in each round. Then, O(log s n) such rounds suffice to build the entire R-tree of log B s log s n = log B n levels. To bulkload log B s levels in each round, a machine is assigned a subset of the data oints (MBRs) that forms a few R-tree branches of log B s levels indeendently from the data assigned to the other machines. This is feasible because we can assign data oints to the machines in their sorted order for acking indeendently.. Parallel Comutation Model Without relying on a articular arallel latform such as Aache Hadoo, we design the arallel bulk-loading algorithm based on a generalized arallel model named the massively arallel communication (MPC) model [,, ]. Poular arallel frameworks such as MaReduce [] and Sark [9] are tyical examles of this model. The MPC model makes the following assumtions. Let n be the inut size, g be the number of machines, and s = n/g.

8 In each round of arallel comutation, every machine receives some data from other machines, erforms comutation, and sends some data to other machines. We consider only algorithms that require a machine to receive/send O(s) words of information in each round (with the terminology of [], these algorithms must have load O(s)). MPC algorithms are measured by: (i) the number of comutation rounds R, (ii) the (arallel) running time T, and (iii) the total amount of comutation W. The running time sums u the maximum comutation cost of a single machine in each round; the total amount of comutation sums u the comutation costs of all machines in all rounds. Let t Mi,r be the time comlexity of machine M i in round r. Then, R R g T = max i..g tm i,r W = r= r= i= t Mi,r For the urose of building an R-tree, W should not exceed the time comlexity O(n log n) for a single-machine imlementation of the roosed acking strategy; T should be O((n log n)/g) to achieve a seedu of g with g machines. A rimitive oeration we need is sorting. In the MPC model, sorting n elements (initially evenly distributed on the g machines) can be done in O(log s n) rounds, O((n log n)/g) running time, and O(n log n) total amount of comutation [] (see Tao et al. [] for a simle algorithm when s g ln(g n) holds). Maing n data oints to the rank sace and sorting by Z-order values thus can be done in O(log s n) rounds. This rocess takes O((n log n)/g) running time and O(n log n) total amount of comutation. We focus on acking the sorted data oints to form an R-tree next. R N N T N 9 N 0 N N R N N M N N N N N N M M Figure : Parallel R-tree bulk-loading. Distributed Packing Every round bulk-loads Θ(log B s) levels of the target R- tree. In the first round, O(s) consecutive data oints are assigned to a machine by the ascending order of their Z-order values, where an R-tree of Θ(log B s) levels is bulk-loaded locally. This creates O(n/s) R-trees. A second round bulkloads the next Θ(log B s) levels of the target R-tree over the root MBRs of those O(n/s) R-trees. For this urose, O(+ g/s) machines are used, each assigned O(s) root MBRs; this results in O(n/s ) tree roots. The above rocess reeats until the MBRs can all be bulk-loaded in a single machine (the M number of articiating machines decreases by a factor of Θ(s) each time, while each such machine is always assigned O(s) MBRs). A total of O(log B n/ log B s) = O(log s n) rounds are incurred, where O(s log s n) = O((n log s n)/g) running time and O(n) total amount of comutation are taken to comute the MBRs. Figure illustrates the rounds, where n =, B =, g =, and s =. A total of log s n = rounds are needed. Each round bulk-loads log B s = levels. In the first round R, every machine is assigned s = data oints. The machines bulk-load R-trees of levels locally. The MBRs of the roots of these local R-trees are bulk-loaded by a single machine M in the second round R. We omit the seudo-code of the arallel bulk-loading algorithm as it is similar to Algorithm, excet that now a machine handles O(s) data oints instead of n, and the loo to bulk-load an R-tree (Lines 9 to ) is broken into rounds.. UPDATE HANDLING Our R-tree structure can also allow dynamic data udates. An object to be removed can be simly deleted from our R-tree in the same way as for a conventional R-tree. An object to be inserted need to be first maed to the rank sace in a similar way to that of query window maing. Let be an object to be inserted and (x e, y e) be its coordinates in the original sace. If x e (y e) equals to the x-coordinates (y-coordinates) of some existing data oints, x e (y e) is simly maed to the largest x-rank (y-rank) of those data oints. Otherwise, x e (y e) is maed to two coordinates x and x (y and y ) in the rank sace. Here, x (y ) is the smallest rank of the data oints whose x- coordinates (y-coordinates) in the original sace are greater than x e (y e); x (y ) is the largest rank of the data oints whose x-coordinates (y-coordinates) in the original sace are smaller than x e (y e). The four coordinates x, x, y, y may form four oints (x, y ), (x, y ), (x, y ), and (x, y ) in the rank sace. To ensure no false negatives in query rocessing, object needs to be maed to all these oints. Then, it can be inserted into our R-tree in the same way as for a conventional R-tree. Allowing dynamic data udates comromises the otimal worst-case I/O cost. A eriodic rebuild of the R-tree is needed for otimal erformance. We target alications where the data set is relatively static, e.g., buildings in a ma tend not to change in every second. The arallel bulk-loading algorithm roosed in Section can be used to eriodically rebuild the R-tree to coe with data udates. As our exeriments show, it only takes. minutes to build an R-tree with 00 million data objects. This should satisfy the targeted alications.. EXPERIMENTS This section reorts exerimental results on comaring the bulk-loading and window query erformance of the R- tree constructed by the roosed strategy with those of the STR-tree [], Hilbert R-tree [0], TGS R-tree [], and PRtree [], which have been described in Section. We denote the baseline algorithms by STR, HR, TGS, and PR, resectively. We denote the roosed algorithm by ZR for that it builds an R-tree based on Z-order values. As discussed in Section., the roosed acking strategy is also alicable to other sace-filling curves such as the Hilbert curve. To demonstrate this alicability, we further

9 Table : Parameters and Their Settings Parameter Setting Distribution Uniform, Skew, Cluster n 0.M, M, M, 0M, 0M α,,,, 9 Query window area (%) 0.00, 0.0, 0.0, 0., 0.,, (a) Tiger data (b) Skew data imlement an R-tree based on the roosed acking strategy where the data oints are sorted and acked by their Hilbert order values in the rank sace. We denote this R-tree by HRR. This R-tree shares a similar structure with the Hilbert R-tree, excet that the data oints are maed to the rank sace before they are acked. Note that the query cost bounds derived in Section. hold for this tree. Following revious studies [,, 0, ], we focus on the I/O cost of the algorithms.. Exerimental Setu The window query exeriments are run on a -bit machine running Ubuntu.0 with a.0 GHz Intel i CPU, GB memory, and a 00 GB hard disk. We use Ke Yi s single-machine imlementation [] of the Hilbert R-tree, TGS R-tree, and PR-tree, which uses the TPIE library [] a C++ library that rovides APIs for imlementation of external memory algorithms and data structures. For ease of comarison, we also imlement a single-machine version of the STR-tree and the roosed HRR and ZR R-trees using TPIE. In all the R-tree structures, we use 0 bytes for each entry in a node. For an inner node entry, these 0 bytes include bytes for the coordinates ( bytes each) of an MBR and bytes for a ointer ointing to the disk block storing the corresonding child node. For a leaf node entry, these 0 bytes include bytes for an ID of a data oint and bytes for the coordinates also in the form of an MBR for ease of imlementation. We use a block size of KB, and the maximum fanout of a node is 0. The bulk-loading exeriments are run on the single machine described above and on a cluster. The arallel bulkloading algorithms are imlemented with Scala and run on Aache Sark..0-SNAPSHOT which also suorts the MaReduce model but is more efficient than Hadoo MaReduce. We use a virtual-node cluster created from an academic comuting cloud (Nectar []) running on OenStack. Each virtual-node has GB memory and cores running at. GHz. One of the nodes acts as the master and the other nodes act as slaves. Each core simulates a worker machine, and hence there are 0 worker machines in total, i.e., g = 0. The network bandwidth is u to 00 Mbs. We use Aache Hadoo..0 with Yarn as the resource manager. We use both real and synthetic data sets. The real data set contains,,9 rectangles (90 MB in size) reresenting geograhical features in eastern states of the USA extracted from the TIGER/Line 00SE data [9]. We use the center of the rectangles as our data oints. We denote this data set as Tiger. Figure a illustrates the data set. Synthetic data sets are generated with a sace domain of where the data set cardinality ranges from 0. to 0 million. We generate three grous of synthetic data sets, denoted as Uniform, Skew, and Cluster, resectively. The Uniform data sets contain data oints with uniform distribution. The Skew and Cluster data sets are (c) Cluster data Figure : Exerimental data generated following the PR-tree aer []. A Skew data set is generated from a Uniform data set by raising the y- coordinates to their owers, i.e., the coordinates of a randomly generated data oint are converted from (x, y) to (x, y α ) where α. Figure b illustrates a Skew data set where α = 9. The Cluster data set contains 0,000 clusters with centers evenly distributed on a horizontal line. Each cluster has,000 oints following a uniform distribution in a square around the center. Figure c illustrates a ortion of the Cluster data set with four clusters. We vary query window size, data set size, and data skewness in the exeriments. The exerimental arameters are summarized in Table, where default values are in bold.. Results We resent the exerimental results in this subsection... Window Query Processing We start with the window query erformance of the R- trees. We generate 00 queries at random locations in each exeriment. For ease of comarison, we follow revious studies [,, 0, ] and reort the average I/O cost er query relative to the outut size. Let the number of blocks read for a query be I and the outut size be k/b. We reort I/(k/B). Note that I/(k/B), i.e., we need to at least read all the blocks containing the data oints in the query answer. A smaller value of I/(k/B) is more referable. I/O cost Query window size (%) HR HRR PR STR TGS ZR Figure : Query I/O cost on Uniform data (varying query window size) Varying query window size. We first vary the area of the query window from 0.00% to % of the data sace. Figure shows the query I/O cost relative to the outut size 9

10 k/b over 0 million uniform data oints. A general observation is that the relative query costs of the R-trees created by the various acking strategies decrease as the query window area increases. This is because a larger query window overlaing a tree node has a better chance of overlaing the data oints in this node, i.e., there are lower ercentages of extra query I/Os that do not contribute to the outut. Meanwhile, the R-trees HRR and ZR created by the roosed acking strategy have the smallest query I/O costs. The advantage is more significant with smaller query windows, e.g., when the query window area is 0.00% of the data sace, the query I/O costs of HRR and ZR are 0% and % lower than that of PR, resectively (note the logarithmic scale). HRR and ZR also outerform HR, STR, and TGS. This demonstrates that HRR and ZR not only have an asymtotically otimal cost in the worst case but also erform well in other cases. HRR in articular outerforms HR by u to % when the query window area is 0.00%. This advantage attributes to the rank sace maing before acking the data oints. The roosed acking strategy effectively incororates the designs of both HR and STR, which have reorted good erformance on non-extreme data. We also notice that HRR outerforms ZR. This suggests that when acking data oints in the same rank sace, the Hilbert curve yields a better acked R-tree than the Z-curve does. This result is consistent with an earlier study [] that comares the query erformance of R-trees acked with the Hilbert curve and the Z-curve in the same Euclidean sace. To hel further understand the benefit of the roosed acking strategy, we list the average outut size (k/b) er query as follows. For the seven different query window areas tested (i.e., from 0.00% to % of the data sace size), the outut sizes are.9, 9.,., 9.9,., 9.0, and.0, resectively. Based on these outut sizes and the relative query I/O costs shown in Fig., we can derive the absolute query I/O costs of the different R-trees. For examle, at query window area being %, the relative query I/O costs of HRR, PR, and ZR are.09,., and., which corresond to 9.9 (.0.09), 0.0 (.0.), and 0.09 (.0.) I/Os, resectively. This means that HRR and ZR have. and. fewer I/Os than PR, resectively. While these numbers might seem small, they are imrovements er query. For target alications such as digital maing, there can be millions of user queries to be rocessed at the same time. The accumulated benefit of HRR and ZR over such a large number of queries is non-trivial. Varying data set cardinality. Next, we vary the data set cardinality from 0. to 0 million while keeing the query window size at 0.0% of the data sace (outut size ranging from 0.9 to 9.). We see from Fig. that HRR and ZR outerform PR consistently by reducing about 0% and 0% of the relative query I/O costs, resectively. For fairness, the PR-tree is designed for rectangles. It may not be otimal on oint data for which HRR and ZR are designed. HR and STR have closer query erformance to that of HRR and ZR, for that they share similar design with HRR and ZR. However, ZR still has a lower cost when the data set cardinality exceeds one million, while HRR has the lowest cost constantly. This again demonstrates the advantage of using the rank sace for indexing. The erformance of TGS fluctuates the most. It relies on a heuristic otimization function when acking the data oints, i.e., minimizing the area of the MBRs, which may not be otimal for all cases. I/O cost 0 9 Skewness HR HRR PR STR TGS ZR Figure 9: Query I/O cost on Skew data Varying skewness. Figure 9 shows the query I/O erformance on Skew data where the skewness arameter α is varied from to 9 (outut size ranging from 9. to.). As the skewness increases, the relative I/O costs of all the R-trees increase excet for the TGS R-tree. This is natural as more skewed data makes it more difficult to reduce overlas between the MBRs of different tree nodes, leading to more blocks being accessed for query rocessing. TGS is an excetion. Its MBR minimizing otimization function leads to better acking for more skewed data. Regardless of this, HRR and ZR still outerform TGS and the other acking strategies. Their relative query I/O costs are u to % and 0% lower than those of PR, resectively. STR is again the closest for that it shares a similar acking strategy with ZR. I/O cost HR HRR PR STR TGS ZR I/O cost HR HRR PR STR TGS ZR Data set size (million) Figure : Query I/O cost on Uniform data (varying data set cardinality) Query window size (%) Figure 0: Query I/O cost on Tiger data 0

An Introduction To Range Searching

An Introduction To Range Searching An Introduction To Range Searching Jan Vahrenhold eartment of Comuter Science Westfälische Wilhelms-Universität Münster, Germany. Overview 1. Introduction: Problem Statement, Lower Bounds 2. Range Searching

More information

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment

More information

The Graph Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule

The Graph Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule The Grah Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule STEFAN D. BRUDA Deartment of Comuter Science Bisho s University Lennoxville, Quebec J1M 1Z7 CANADA bruda@cs.ubishos.ca

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar 15-859(M): Randomized Algorithms Lecturer: Anuam Guta Toic: Lower Bounds on Randomized Algorithms Date: Setember 22, 2004 Scribe: Srinath Sridhar 4.1 Introduction In this lecture, we will first consider

More information

GIVEN an input sequence x 0,..., x n 1 and the

GIVEN an input sequence x 0,..., x n 1 and the 1 Running Max/Min Filters using 1 + o(1) Comarisons er Samle Hao Yuan, Member, IEEE, and Mikhail J. Atallah, Fellow, IEEE Abstract A running max (or min) filter asks for the maximum or (minimum) elements

More information

Outline. EECS150 - Digital Design Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs) Simple Error Detection Coding

Outline. EECS150 - Digital Design Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs) Simple Error Detection Coding Outline EECS150 - Digital Design Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs) Error detection using arity Hamming code for error detection/correction Linear Feedback Shift

More information

Unit 1 - Computer Arithmetic

Unit 1 - Computer Arithmetic FIXD-POINT (FX) ARITHMTIC Unit 1 - Comuter Arithmetic INTGR NUMBRS n bit number: b n 1 b n 2 b 0 Decimal Value Range of values UNSIGND n 1 SIGND D = b i 2 i D = 2 n 1 b n 1 + b i 2 i n 2 i=0 i=0 [0, 2

More information

Analysis of execution time for parallel algorithm to dertmine if it is worth the effort to code and debug in parallel

Analysis of execution time for parallel algorithm to dertmine if it is worth the effort to code and debug in parallel Performance Analysis Introduction Analysis of execution time for arallel algorithm to dertmine if it is worth the effort to code and debug in arallel Understanding barriers to high erformance and redict

More information

A Parallel Algorithm for Minimization of Finite Automata

A Parallel Algorithm for Minimization of Finite Automata A Parallel Algorithm for Minimization of Finite Automata B. Ravikumar X. Xiong Deartment of Comuter Science University of Rhode Island Kingston, RI 02881 E-mail: fravi,xiongg@cs.uri.edu Abstract In this

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

New Schedulability Test Conditions for Non-preemptive Scheduling on Multiprocessor Platforms

New Schedulability Test Conditions for Non-preemptive Scheduling on Multiprocessor Platforms New Schedulability Test Conditions for Non-reemtive Scheduling on Multirocessor Platforms Technical Reort May 2008 Nan Guan 1, Wang Yi 2, Zonghua Gu 3 and Ge Yu 1 1 Northeastern University, Shenyang, China

More information

The Value of Even Distribution for Temporal Resource Partitions

The Value of Even Distribution for Temporal Resource Partitions The Value of Even Distribution for Temoral Resource Partitions Yu Li, Albert M. K. Cheng Deartment of Comuter Science University of Houston Houston, TX, 7704, USA htt://www.cs.uh.edu Technical Reort Number

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

John Weatherwax. Analysis of Parallel Depth First Search Algorithms

John Weatherwax. Analysis of Parallel Depth First Search Algorithms Sulementary Discussions and Solutions to Selected Problems in: Introduction to Parallel Comuting by Viin Kumar, Ananth Grama, Anshul Guta, & George Karyis John Weatherwax Chater 8 Analysis of Parallel

More information

On the Chvatál-Complexity of Knapsack Problems

On the Chvatál-Complexity of Knapsack Problems R u t c o r Research R e o r t On the Chvatál-Comlexity of Knasack Problems Gergely Kovács a Béla Vizvári b RRR 5-08, October 008 RUTCOR Rutgers Center for Oerations Research Rutgers University 640 Bartholomew

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

Improved Capacity Bounds for the Binary Energy Harvesting Channel

Improved Capacity Bounds for the Binary Energy Harvesting Channel Imroved Caacity Bounds for the Binary Energy Harvesting Channel Kaya Tutuncuoglu 1, Omur Ozel 2, Aylin Yener 1, and Sennur Ulukus 2 1 Deartment of Electrical Engineering, The Pennsylvania State University,

More information

Parallelism and Locality in Priority Queues. A. Ranade S. Cheng E. Deprit J. Jones S. Shih. University of California. Berkeley, CA 94720

Parallelism and Locality in Priority Queues. A. Ranade S. Cheng E. Deprit J. Jones S. Shih. University of California. Berkeley, CA 94720 Parallelism and Locality in Priority Queues A. Ranade S. Cheng E. Derit J. Jones S. Shih Comuter Science Division University of California Berkeley, CA 94720 Abstract We exlore two ways of incororating

More information

A randomized sorting algorithm on the BSP model

A randomized sorting algorithm on the BSP model A randomized sorting algorithm on the BSP model Alexandros V. Gerbessiotis a, Constantinos J. Siniolakis b a CS Deartment, New Jersey Institute of Technology, Newark, NJ 07102, USA b The American College

More information

On the Toppling of a Sand Pile

On the Toppling of a Sand Pile Discrete Mathematics and Theoretical Comuter Science Proceedings AA (DM-CCG), 2001, 275 286 On the Toling of a Sand Pile Jean-Christohe Novelli 1 and Dominique Rossin 2 1 CNRS, LIFL, Bâtiment M3, Université

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

16. Binary Search Trees

16. Binary Search Trees Dictionary imlementation 16. Binary Search Trees [Ottman/Widmayer, Ka..1, Cormen et al, Ka. 1.1-1.] Hashing: imlementation of dictionaries with exected very fast access times. Disadvantages of hashing:

More information

Recursive Estimation of the Preisach Density function for a Smart Actuator

Recursive Estimation of the Preisach Density function for a Smart Actuator Recursive Estimation of the Preisach Density function for a Smart Actuator Ram V. Iyer Deartment of Mathematics and Statistics, Texas Tech University, Lubbock, TX 7949-142. ABSTRACT The Preisach oerator

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

Analysis of Multi-Hop Emergency Message Propagation in Vehicular Ad Hoc Networks

Analysis of Multi-Hop Emergency Message Propagation in Vehicular Ad Hoc Networks Analysis of Multi-Ho Emergency Message Proagation in Vehicular Ad Hoc Networks ABSTRACT Vehicular Ad Hoc Networks (VANETs) are attracting the attention of researchers, industry, and governments for their

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

16. Binary Search Trees

16. Binary Search Trees Dictionary imlementation 16. Binary Search Trees [Ottman/Widmayer, Ka..1, Cormen et al, Ka. 12.1-12.] Hashing: imlementation of dictionaries with exected very fast access times. Disadvantages of hashing:

More information

The non-stochastic multi-armed bandit problem

The non-stochastic multi-armed bandit problem Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at

More information

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms

More information

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

Elliptic Curves and Cryptography

Elliptic Curves and Cryptography Ellitic Curves and Crytograhy Background in Ellitic Curves We'll now turn to the fascinating theory of ellitic curves. For simlicity, we'll restrict our discussion to ellitic curves over Z, where is a

More information

Strong Matching of Points with Geometric Shapes

Strong Matching of Points with Geometric Shapes Strong Matching of Points with Geometric Shaes Ahmad Biniaz Anil Maheshwari Michiel Smid School of Comuter Science, Carleton University, Ottawa, Canada December 9, 05 In memory of Ferran Hurtado. Abstract

More information

An Analysis of Reliable Classifiers through ROC Isometrics

An Analysis of Reliable Classifiers through ROC Isometrics An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit

More information

Principles of Computed Tomography (CT)

Principles of Computed Tomography (CT) Page 298 Princiles of Comuted Tomograhy (CT) The theoretical foundation of CT dates back to Johann Radon, a mathematician from Vienna who derived a method in 1907 for rojecting a 2-D object along arallel

More information

Model checking, verification of CTL. One must verify or expel... doubts, and convert them into the certainty of YES [Thomas Carlyle]

Model checking, verification of CTL. One must verify or expel... doubts, and convert them into the certainty of YES [Thomas Carlyle] Chater 5 Model checking, verification of CTL One must verify or exel... doubts, and convert them into the certainty of YES or NO. [Thomas Carlyle] 5. The verification setting Page 66 We introduce linear

More information

A Special Case Solution to the Perspective 3-Point Problem William J. Wolfe California State University Channel Islands

A Special Case Solution to the Perspective 3-Point Problem William J. Wolfe California State University Channel Islands A Secial Case Solution to the Persective -Point Problem William J. Wolfe California State University Channel Islands william.wolfe@csuci.edu Abstract In this aer we address a secial case of the ersective

More information

Elementary Analysis in Q p

Elementary Analysis in Q p Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Lilian Markenzon 1, Nair Maria Maia de Abreu 2* and Luciana Lee 3

Lilian Markenzon 1, Nair Maria Maia de Abreu 2* and Luciana Lee 3 Pesquisa Oeracional (2013) 33(1): 123-132 2013 Brazilian Oerations Research Society Printed version ISSN 0101-7438 / Online version ISSN 1678-5142 www.scielo.br/oe SOME RESULTS ABOUT THE CONNECTIVITY OF

More information

A generalization of Amdahl's law and relative conditions of parallelism

A generalization of Amdahl's law and relative conditions of parallelism A generalization of Amdahl's law and relative conditions of arallelism Author: Gianluca Argentini, New Technologies and Models, Riello Grou, Legnago (VR), Italy. E-mail: gianluca.argentini@riellogrou.com

More information

Metrics Performance Evaluation: Application to Face Recognition

Metrics Performance Evaluation: Application to Face Recognition Metrics Performance Evaluation: Alication to Face Recognition Naser Zaeri, Abeer AlSadeq, and Abdallah Cherri Electrical Engineering Det., Kuwait University, P.O. Box 5969, Safat 6, Kuwait {zaery, abeer,

More information

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Haiing Lu, K.N. Plataniotis and A.N. Venetsanooulos The Edward S. Rogers Sr. Deartment of

More information

SAT based Abstraction-Refinement using ILP and Machine Learning Techniques

SAT based Abstraction-Refinement using ILP and Machine Learning Techniques SAT based Abstraction-Refinement using ILP and Machine Learning Techniques 1 SAT based Abstraction-Refinement using ILP and Machine Learning Techniques Edmund Clarke James Kukula Anubhav Guta Ofer Strichman

More information

PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY

PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY MARTIN D SCHATZ, ROBERT A VAN DE GEIJN, AND JACK POULSON Abstract We exose a systematic aroach for develoing distributed memory arallel matrix matrix

More information

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation 6.3 Modeling and Estimation of Full-Chi Leaage Current Considering Within-Die Correlation Khaled R. eloue, Navid Azizi, Farid N. Najm Deartment of ECE, University of Toronto,Toronto, Ontario, Canada {haled,nazizi,najm}@eecg.utoronto.ca

More information

Fig. 4. Example of Predicted Workers. Fig. 3. A Framework for Tackling the MQA Problem.

Fig. 4. Example of Predicted Workers. Fig. 3. A Framework for Tackling the MQA Problem. 217 IEEE 33rd International Conference on Data Engineering Prediction-Based Task Assignment in Satial Crowdsourcing Peng Cheng #, Xiang Lian, Lei Chen #, Cyrus Shahabi # Hong Kong University of Science

More information

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition TNN-2007-P-0332.R1 1 Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Haiing Lu, K.N. Plataniotis and A.N. Venetsanooulos The Edward S. Rogers

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY

PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY PARALLEL MATRIX MULTIPLICATION: A SYSTEMATIC JOURNEY MARTIN D SCHATZ, ROBERT A VAN DE GEIJN, AND JACK POULSON Abstract We exose a systematic aroach for develoing distributed memory arallel matrixmatrix

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

ON POLYNOMIAL SELECTION FOR THE GENERAL NUMBER FIELD SIEVE

ON POLYNOMIAL SELECTION FOR THE GENERAL NUMBER FIELD SIEVE MATHEMATICS OF COMPUTATIO Volume 75, umber 256, October 26, Pages 237 247 S 25-5718(6)187-9 Article electronically ublished on June 28, 26 O POLYOMIAL SELECTIO FOR THE GEERAL UMBER FIELD SIEVE THORSTE

More information

XReason: A Semantic Approach that Reasons with Patterns to Answer XML Keyword Queries

XReason: A Semantic Approach that Reasons with Patterns to Answer XML Keyword Queries XReason: A Semantic Aroach that Reasons with Patterns to Answer XML Keyword Queries Cem Aksoy 1, Aggeliki Dimitriou 2, Dimitri Theodoratos 1, Xiaoying Wu 3 1 New Jersey Institute of Technology, Newark,

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

Probability Estimates for Multi-class Classification by Pairwise Coupling

Probability Estimates for Multi-class Classification by Pairwise Coupling Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics

More information

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS #A13 INTEGERS 14 (014) ON THE LEAST SIGNIFICANT ADIC DIGITS OF CERTAIN LUCAS NUMBERS Tamás Lengyel Deartment of Mathematics, Occidental College, Los Angeles, California lengyel@oxy.edu Received: 6/13/13,

More information

Solved Problems. (a) (b) (c) Figure P4.1 Simple Classification Problems First we draw a line between each set of dark and light data points.

Solved Problems. (a) (b) (c) Figure P4.1 Simple Classification Problems First we draw a line between each set of dark and light data points. Solved Problems Solved Problems P Solve the three simle classification roblems shown in Figure P by drawing a decision boundary Find weight and bias values that result in single-neuron ercetrons with the

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

Matching Partition a Linked List and Its Optimization

Matching Partition a Linked List and Its Optimization Matching Partition a Linked List and Its Otimization Yijie Han Deartment of Comuter Science University of Kentucky Lexington, KY 40506 ABSTRACT We show the curve O( n log i + log (i) n + log i) for the

More information

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces Abstract and Alied Analysis Volume 2012, Article ID 264103, 11 ages doi:10.1155/2012/264103 Research Article An iterative Algorithm for Hemicontractive Maings in Banach Saces Youli Yu, 1 Zhitao Wu, 2 and

More information

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu

More information

Joint Property Estimation for Multiple RFID Tag Sets Using Snapshots of Variable Lengths

Joint Property Estimation for Multiple RFID Tag Sets Using Snapshots of Variable Lengths Joint Proerty Estimation for Multile RFID Tag Sets Using Snashots of Variable Lengths ABSTRACT Qingjun Xiao Key Laboratory of Comuter Network and Information Integration Southeast University) Ministry

More information

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM

More information

INTRODUCTION. Please write to us at if you have any comments or ideas. We love to hear from you.

INTRODUCTION. Please write to us at if you have any comments or ideas. We love to hear from you. Casio FX-570ES One-Page Wonder INTRODUCTION Welcome to the world of Casio s Natural Dislay scientific calculators. Our exeriences of working with eole have us understand more about obstacles eole face

More information

RESOLUTIONS OF THREE-ROWED SKEW- AND ALMOST SKEW-SHAPES IN CHARACTERISTIC ZERO

RESOLUTIONS OF THREE-ROWED SKEW- AND ALMOST SKEW-SHAPES IN CHARACTERISTIC ZERO RESOLUTIONS OF THREE-ROWED SKEW- AND ALMOST SKEW-SHAPES IN CHARACTERISTIC ZERO MARIA ARTALE AND DAVID A. BUCHSBAUM Abstract. We find an exlicit descrition of the terms and boundary mas for the three-rowed

More information

Optimal Approximate Polytope Membership

Optimal Approximate Polytope Membership Otimal Aroximate Polytoe Membershi Sunil Arya Deartment of Comuter Science and Engineering The Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong arya@cse.ust.hk Abstract

More information

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &

More information

Research of PMU Optimal Placement in Power Systems

Research of PMU Optimal Placement in Power Systems Proceedings of the 5th WSEAS/IASME Int. Conf. on SYSTEMS THEORY and SCIENTIFIC COMPUTATION, Malta, Setember 15-17, 2005 (38-43) Research of PMU Otimal Placement in Power Systems TIAN-TIAN CAI, QIAN AI

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

Adaptive estimation with change detection for streaming data

Adaptive estimation with change detection for streaming data Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment

More information

Topic 7: Using identity types

Topic 7: Using identity types Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules

More information

A Social Welfare Optimal Sequential Allocation Procedure

A Social Welfare Optimal Sequential Allocation Procedure A Social Welfare Otimal Sequential Allocation Procedure Thomas Kalinowsi Universität Rostoc, Germany Nina Narodytsa and Toby Walsh NICTA and UNSW, Australia May 2, 201 Abstract We consider a simle sequential

More information

The Fekete Szegő theorem with splitting conditions: Part I

The Fekete Szegő theorem with splitting conditions: Part I ACTA ARITHMETICA XCIII.2 (2000) The Fekete Szegő theorem with slitting conditions: Part I by Robert Rumely (Athens, GA) A classical theorem of Fekete and Szegő [4] says that if E is a comact set in the

More information

Asymptotic Properties of the Markov Chain Model method of finding Markov chains Generators of..

Asymptotic Properties of the Markov Chain Model method of finding Markov chains Generators of.. IOSR Journal of Mathematics (IOSR-JM) e-issn: 78-578, -ISSN: 319-765X. Volume 1, Issue 4 Ver. III (Jul. - Aug.016), PP 53-60 www.iosrournals.org Asymtotic Proerties of the Markov Chain Model method of

More information

Computer arithmetic. Intensive Computation. Annalisa Massini 2017/2018

Computer arithmetic. Intensive Computation. Annalisa Massini 2017/2018 Comuter arithmetic Intensive Comutation Annalisa Massini 7/8 Intensive Comutation - 7/8 References Comuter Architecture - A Quantitative Aroach Hennessy Patterson Aendix J Intensive Comutation - 7/8 3

More information

19th Bay Area Mathematical Olympiad. Problems and Solutions. February 28, 2017

19th Bay Area Mathematical Olympiad. Problems and Solutions. February 28, 2017 th Bay Area Mathematical Olymiad February, 07 Problems and Solutions BAMO- and BAMO- are each 5-question essay-roof exams, for middle- and high-school students, resectively. The roblems in each exam are

More information

Brownian Motion and Random Prime Factorization

Brownian Motion and Random Prime Factorization Brownian Motion and Random Prime Factorization Kendrick Tang June 4, 202 Contents Introduction 2 2 Brownian Motion 2 2. Develoing Brownian Motion.................... 2 2.. Measure Saces and Borel Sigma-Algebras.........

More information

Multiplicative group law on the folium of Descartes

Multiplicative group law on the folium of Descartes Multilicative grou law on the folium of Descartes Steluţa Pricoie and Constantin Udrişte Abstract. The folium of Descartes is still studied and understood today. Not only did it rovide for the roof of

More information

Numerical Methods for Particle Tracing in Vector Fields

Numerical Methods for Particle Tracing in Vector Fields On-Line Visualization Notes Numerical Methods for Particle Tracing in Vector Fields Kenneth I. Joy Visualization and Grahics Research Laboratory Deartment of Comuter Science University of California, Davis

More information

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:

More information

1 1 c (a) 1 (b) 1 Figure 1: (a) First ath followed by salesman in the stris method. (b) Alternative ath. 4. D = distance travelled closing the loo. Th

1 1 c (a) 1 (b) 1 Figure 1: (a) First ath followed by salesman in the stris method. (b) Alternative ath. 4. D = distance travelled closing the loo. Th 18.415/6.854 Advanced Algorithms ovember 7, 1996 Euclidean TSP (art I) Lecturer: Michel X. Goemans MIT These notes are based on scribe notes by Marios Paaefthymiou and Mike Klugerman. 1 Euclidean TSP Consider

More information

Efficient Approximations for Call Admission Control Performance Evaluations in Multi-Service Networks

Efficient Approximations for Call Admission Control Performance Evaluations in Multi-Service Networks Efficient Aroximations for Call Admission Control Performance Evaluations in Multi-Service Networks Emre A. Yavuz, and Victor C. M. Leung Deartment of Electrical and Comuter Engineering The University

More information

The Knuth-Yao Quadrangle-Inequality Speedup is a Consequence of Total-Monotonicity

The Knuth-Yao Quadrangle-Inequality Speedup is a Consequence of Total-Monotonicity The Knuth-Yao Quadrangle-Ineuality Seedu is a Conseuence of Total-Monotonicity Wolfgang W. Bein Mordecai J. Golin Lawrence L. Larmore Yan Zhang Abstract There exist several general techniues in the literature

More information

Eigenanalysis of Finite Element 3D Flow Models by Parallel Jacobi Davidson

Eigenanalysis of Finite Element 3D Flow Models by Parallel Jacobi Davidson Eigenanalysis of Finite Element 3D Flow Models by Parallel Jacobi Davidson Luca Bergamaschi 1, Angeles Martinez 1, Giorgio Pini 1, and Flavio Sartoretto 2 1 Diartimento di Metodi e Modelli Matematici er

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

Maximum Entropy and the Stress Distribution in Soft Disk Packings Above Jamming

Maximum Entropy and the Stress Distribution in Soft Disk Packings Above Jamming Maximum Entroy and the Stress Distribution in Soft Disk Packings Above Jamming Yegang Wu and S. Teitel Deartment of Physics and Astronomy, University of ochester, ochester, New York 467, USA (Dated: August

More information

Robust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System

Robust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System Vol.7, No.7 (4),.37-38 htt://dx.doi.org/.457/ica.4.7.7.3 Robust Predictive Control of Inut Constraints and Interference Suression for Semi-Trailer System Zhao, Yang Electronic and Information Technology

More information

PER-PATCH METRIC LEARNING FOR ROBUST IMAGE MATCHING. Sezer Karaoglu, Ivo Everts, Jan C. van Gemert, and Theo Gevers

PER-PATCH METRIC LEARNING FOR ROBUST IMAGE MATCHING. Sezer Karaoglu, Ivo Everts, Jan C. van Gemert, and Theo Gevers PER-PATCH METRIC LEARNING FOR ROBUST IMAGE MATCHING Sezer Karaoglu, Ivo Everts, Jan C. van Gemert, and Theo Gevers Intelligent Systems Lab, Amsterdam, University of Amsterdam, 1098 XH Amsterdam, The Netherlands

More information

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction GOOD MODELS FOR CUBIC SURFACES ANDREAS-STEPHAN ELSENHANS Abstract. This article describes an algorithm for finding a model of a hyersurface with small coefficients. It is shown that the aroach works in

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Algorithms for Air Traffic Flow Management under Stochastic Environments

Algorithms for Air Traffic Flow Management under Stochastic Environments Algorithms for Air Traffic Flow Management under Stochastic Environments Arnab Nilim and Laurent El Ghaoui Abstract A major ortion of the delay in the Air Traffic Management Systems (ATMS) in US arises

More information

Finding recurrent sources in sequences

Finding recurrent sources in sequences Finding recurrent sources in sequences Aristides Gionis Deartment of Comuter Science Stanford University Stanford, CA, 94305, USA gionis@cs.stanford.edu Heikki Mannila HIIT Basic Research Unit Deartment

More information

Cryptography. Lecture 8. Arpita Patra

Cryptography. Lecture 8. Arpita Patra Crytograhy Lecture 8 Arita Patra Quick Recall and Today s Roadma >> Hash Functions- stands in between ublic and rivate key world >> Key Agreement >> Assumtions in Finite Cyclic grous - DL, CDH, DDH Grous

More information