# I/O-Efficient Computation of Water Flow Across a Terrain

Size: px
Start display at page:

Transcription

2 based on this idea has been presented in [14] In Section 32, we discuss an O(N log N) time implementation of this approach using a priority queue and a union-find structure This simulation approach does not translate into an I/Oefficient algorithm, as the only known I/O-efficient unionfind structure [1] requires the sequence of Union operations to be known in advance In simulating the sequence of spill events, the set of Union operations is known, but their order depends on the spill times of the basins Using an internal-memory union-find structure and an I/O-efficient priority queue (eg, [3, 12]), a cost of O(Xα(X) + sort(n)) disk accesses can be achieved, where α( ) is the inverse of Ackermann s function, sort(n) N is the I/O complexity of sorting N data items (see next section), and X is the number of pits of the terrain In contrast, the algorithm presented in this paper achieves an I/O complexity of O(sort(X) log(x/m) + sort(n)) disk accesses In the remainder of this section, we formally define the computational model we use, discuss previous work, and give an overview of our algorithm Section 2 introduces the terminology and notation we use throughout the paper Section 3 discusses our algorithm for computing the spill times of the terrain s basins and the flooding times of all terrain vertices This algorithm makes use of an I/O-efficient meldable priority queue, which we discuss in Section 4 11 I/O Model We use the I/O model with one (logical) disk [2] to design and analyze our algorithm In this model, the computer is equipped with a two-level memory hierarchy consisting of an internal memory and a (disk-based) external memory The internal memory is capable of holding M data items (vertices, edges, etc), while the external memory is of conceptually unlimited size All computation has to happen on data in internal memory Data is transferred between internal and external memory in blocks of B consecutive data items Such a transfer is referred to as an I/O operation or I/O The cost of an algorithm is the number of I/Os it performs The number of I/Os required to read N contiguous items from disk is N/B The number of I/Os required to sort N items is sort(n) = Θ((N/B) log M/B (N/B)) [2] For all realistic values of N, M, and B, we have N/B < sort(n) N 12 Related Work Due to its importance, the problem of computing water flow across a terrain (usually in the form of a river network) has been studied extensively Most existing methods for computing river networks assume that once water flows into a small basin of the terrain, it never flows out in other words, basins do not spill Therefore, to avoid water getting caught in small local basins, most flow modelling approaches first remove all basins by flooding the terrain, that is, conceptually pouring water onto the terrain until all basins are filled [4, 13, 15] However, this often leads to unrealistic flow patterns, since many important geographical features are removed Recent papers [1,5] suggested partial flooding algorithms that flood only small basins, where the size of a basin can be defined using different measures, such as height, volume or area Agarwal, Arge and Yi [1] described an O(sort(N)) I/O partial flooding algorithm that removes all basins of small height This is done by computing the topological persistence [10, 11] of each basin and removing the ones with persistence below a small threshold The key to obtaining an O(sort(N)) I/O solution to this problem is an algorithm that can process a sequence of N Union and Find operations using O(sort(N)) I/Os, assuming the entire sequence of operations is provided in advance Arge and Revsbæk [5] extended the result of [1] to removing basins based on different geometric measures, including volume and area Partial flooding methods provide a basis only for approximate solutions to flow modelling, as the underlying assumption is that all small basins are flooded at a certain time, while all big basins are not This assumption may not be true, as the spill time of a basin depends on its volume and its watershed, and the watershed of a basin may grow over time as a result of other basins spilling into it; in particular, the watershed of a big basin may grow to a size far exceeding the size of the watershed of a small basin and, thus, the big basin may spill before the small basin To model the flow network at time t accurately, it is necessary to compute the times the basins of T fill and remove the basins that are full at time t This is much harder than the partial flooding approaches discussed above, as the above methods are based on local measures associated with each basin, while the computation of the actual spill time of each basin β depends on the spill times of other basins that spill into β As mentioned in the introduction, Liu and Snoeyink [14] presented an internal-memory algorithm for computing actual spill times and used it for flow prediction While the paper does not present the algorithm in detail, we discuss an efficient implementation using a union-find structure and a priority queue in Section 32; this implementation is needed as part of our I/O-efficient algorithm We also require a tree structure that represents the nesting of the basins of T This tree has been termed the merge tree of T in [7,8] (also see Section 2) and can be computed using O(sort(N)) I/Os using the topological persistence algorithm of [1] This algorithm is easily augmented to compute the lowest saddle on the boundary of each basin and, as shown in [5], the volume of each basin Computing the watershed sizes of all basins of T is harder The watershed sizes of all basins of T are easily computed from the watershed sizes of all pits by summing watershed sizes bottom up in the merge tree However, the only I/Oefficient algorithm for computing the watersheds of all pits of a terrain exactly is the one of [9] This algorithm performs O(sort(N + S)) I/Os for fat terrains, where S is the size of the terrain s strip map The size of the strip map of a fat terrain is Θ(N 2 ) in the worst case [9], in which case the exact computation of watersheds becomes infeasible An approach often taken in practice [16 18] is to assume that water flows only along terrain edges, allowing the computation of watersheds using O(sort(N)) I/Os While this may lead to poor approximations of the watersheds for irregular terrains [9], TIN s derived from LIDAR data, for example, are rather regular, and the computed watersheds match the real watersheds rather closely 13 New Result We present an algorithm that computes the flooding times of all terrain vertices using O(sort(X) log(x/m) + sort(n)) I/Os, where N is the size of the terrain and X is the number of pits This assumes that the watershed sizes and volumes of all basins are given and that every vertex is labelled with the pit whose watershed contains it The cost of computing

4 i V i = 315 i t i = 875 V c = 49 c h V h = 123 t c = 326 c h t h = 478 a b V b = 6 V a = 13 Wa 0 = 9 Wb 0 = 6 Wc 0 = 15 d V d = 2 e V e = 13 Wd 0 = 5 W0 e = 7 Wf 0 = 9 Wg 0 = 16 Wh 0 = 21 Wi 0 = 36 (a) g V g = 44 f V f = 3 a t a = 126 b t b = 1 d t d = 04 (b) t g = 219 g f e t f = 033 t e = 085 Figure 1: (a) A 1-d terrain with its corresponding merge tree and volumes and initial watersheds of its basins (b) The spill times of the basins and the water levels in the basins at time t = 2 α 5 α 4 α 0 α 1 β 1 α 2 β 2 α 3 β 3 β 4 β 1 β 2 β 6 β 4 α 0 α 1 α 2 α 3 α 4 α 5 α 6 β 7 β 5 β 3 β 4 β 2 β 6 β 5 α 7 β 7 β 3 α 6 β 6 β 5 α 7 β 1 α 7 β 7 (a) (b) (c) Figure 2: Figure (a) shows the merge tree M(S) of a set S = {α 0, α 1, β 1,, α 7, β 7} of basins Figure (b) illustrates the nesting of these basins using contour lines through their spill points Figure (c) shows the flow tree of basins α 7, β 1, β 2,, β 7 Every edge corresponds to a spill point and joins the two basins containing the endpoints of the trickle paths starting at this spill point (shown as dashed lines in Figure (b)) Note that the trickle paths corresponding to different edges incident to a flow tree node may end in different pits inside the basin This is the case for basin β 6 in Figure (b) O(sort(X) log(x/m)) I/Os The second phase (Section 34) then computes the flooding times of all vertices of T using O(sort(N)) I/Os 31 Computing Spill Times We assume the merge tree M is given and every node β M stores V β, W 0 β, as well as identifiers of the two elementary basins λ f (β) and λ r(β) such that λ f (β) M β and W 0 λ f (β) and W 0 λ r(β) touch in p β ; λ f (β) is needed for some data structure queries in our algorithm, while λ r(β) is the basin where water spilling from β would collect if we poured water only into β We further assume the elementary basins of T have been numbered using a preorder traversal of the flow tree F of T starting at an arbitrary root, and every elementary basin stores the preorder interval of its descendants This information can be computed from the input of our algorithm using O(sort(X)) I/Os using standard techniques Details appear in the full paper Our algorithm for computing the spill times of all basins is recursive Every recursive call takes a node β M and a list R β of all tributaries of β as input Each tributary τ R β stores its spill time t τ, its watershed W τ := Wτ tτ at time t τ, and the preorder number of an elementary basin it contains The task of the recursive call on a node s to compute the spill times of all proper descendants of n M The top-level invocation takes the root ρ of M and an empty list of tributaries as input (since ρ has no tributaries) We distinguish two cases for each recursive call If M β M, we use the algorithm in Section 32 below to solve the problem using O(sort(X)) I/Os, where X is the total input size of the invocation, that is, X := M β + R β If M β > M, we compute a heaviest path P in M β This path consists of a sequence of nodes α 0, α 1,, α k, where α 0 = β, α k is a leaf, and, for 1 i k, α i is the child of α i 1 with the bigger subtree M αi among the two children of α i 1 We use to denote the other child of α i 1 See

5 Figure 2(a) The first step of the algorithm is to compute the spill times t αi and t βi and the set R βi of tributaries of, for all 1 i k To finish the computation of spill times, we recursively invoke the algorithm on each node, 1 i k As we show in Section 33, the computation of the spill times t αi and t βi, for 1 i k, and of the lists of tributaries of the basins β 1, β 2,, β k takes O(sort(k + R β )) = O(sort(X)) I/Os, where X := M β + R β The path P can be computed using O(sort(X)) I/Os using the Euler tour technique and list ranking [6] This gives the following result Theorem 1 The spill times of all basins of a terrain T can be computed using O(sort(X) log(x/m)) I/Os, where X is the number of elementary basins of T Proof We prove in Section 32 that, given the tributaries of β, we compute the correct spill times for all basins in M n the case M β M In Section 33, we prove that, in the case M β > M, we compute the spill times of basins α i and and the tributary lists R βi, for 1 i k, correctly The correctness of the algorithm then follows by induction The I/O complexity of an invocation of the algorithm with total input size X := M β + R β and merge tree size Y := M β is given by the recurrence ( O(sort(X)) Y M T(X,Y ) = O(sort(X)) + P k, i=1 T(Xi, Yi) Y > M where Y i := M βi and X i := Y i + R βi By the definition of a tributary, every basin τ with τ M β or τ R β can P be the tributary of at most one basin Hence, we have k i=1 Xi X Furthermore, the choice of the path P as a heaviest path ensures that Y i Y/2, for all 1 i k Together, these two facts imply that the recurrence solves to T(X, Y ) = O(sort(X) log(y/m)) For the root of M, we have X = Y, that is, the overall complexity of the algorithm is O(sort(X)log(X/M)), as claimed 32 Small Basins To solve the case M β M, we provide an implementation of the algorithm of [14] using a union-find structure and a priority queue and extend it to take the tributaries of the basin nto account when computing the spill times of all sub-basins of β Before running the actual algorithm, we compute, for each tributary τ of β, the elementary sub-basin λ r(τ) of β that τ spills into; that is, τ spills into β across a saddle on the boundary of W 0 λ r(τ) These basins can be computed from the preorder numbers stored with the tributaries of β and the preorder intervals of the elementary sub-basins of β using O(sort(X)) I/Os Details appear in the full paper The algorithm maintains a set of active basins in M β, which are the basins that have not spilled yet but whose children have already spilled A basin that has spilled is finished, and a basin with at least one unfinished child is inactive The set of active basins always contains the next basin in M β to spill We maintain the set of active basins using two data structures: a priority queue Q and a unionfind structure U The priority queue stores the active basins with priorities equal to their predicted spill times these times decrease over time as we discover more tributaries of active basins The union-find structure stores the elementary basins (leaves) in M β and allows us to find, for each such basin α, the active basin α such that W 0 α is currently part of W α More precisely, a Find(α) operation returns a representative elementary sub-basin of α, and it is easy to ensure that at all times, the representative of α stores the ID of α Initially, all elementary basins are active and all other basins are inactive; the predicted spill time of an elementary basin α is t α := V α/w 0 α To allow the updating of predicted spill times, each active basin α also stores the time u α when its watershed changed last that is, the spill time of its most recent tributary as well as its residual volume V r α at time u α, which is the portion of V α left to be filled at this time Initially, V r α = V α and u α = 0, for every elementary basin α of T After initializing the algorithm s data structures as just described, we process the events in Q and R β by increasing time until Q contains only the basin β The details of each iteration are as follows Let τ be the next tributary in R β to be processed, and let α be the active basin with minimum priority in Q If t τ < t α, we remove τ from R β and process E τ as a watershed event with time t τ, as described below If t α < t τ, we remove α from Q and consider its sibling σ If σ is not finished, we process E α as a watershed event with time t α; otherwise we process it as a basin event with time t α Watershed event: A watershed event occurs when a basin α spills into a non-full basin α, thereby increasing the watershed of α To process a watershed event E α with time t, we find the active basin α that α spills into using a Find(λ r(α)) operation on U We update the information for α as V r α := V r α W α (t u α ), W α := W α + Wα, u α := t, and t α := t + V r α /W α, and then update the priority of α in Q accordingly If α R β, this finishes the processing of E α If α M β, we need to update U to reflect that water falling or flowing onto W α after time t flows into α We do this by performing a Union(λ f (α), λ r(α)) operation on U and ensuring that the representative of the resulting set of elementary basins points to α We also label the node α in M β as finished Basin event: A basin event occurs when the second of two sibling basins becomes full, leaving its parent basin in M β to fill When processing a basin event E α with time t, for some α M β, both α and its sibling σ in M β are full at time t Thus, we label α as finished in M β and mark its parent γ as active Water that flowed into α before time t now collects in γ Thus, we set W γ := W α and ensure that the representative of α now points to γ Since both α and σ are full at time t, we set Vγ r := V γ V α V σ, u γ := t, and t γ := t + Vγ r /W γ Then we insert γ into Q with priority t γ The following lemma states the correctness and I/O complexity of the above procedure Lemma 1 Let β be a basin with at most M sub-basins, and let X := M β + R β The spill times of all basins α M β can be computed using O(sort(X)) I/Os Proof To bound the I/O complexity of the procedure, we observe that Q, U, and M β fit in memory and that scanning the sorted list of tributaries R β requires us to keep only one buffer block of size B in memory Thus, apart from sorting the tributaries in R β by their spill times using O(sort(X)) I/Os, the algorithm only loads M nto memory and scans R β, which takes O(X/B) I/Os All other processing happens in memory

6 To prove the correctness of the algorithm, we consider the sequence of events E 1, E 2,, E X affecting basin β and its sub-basins, sorted by their times t 1, t 2,, t X That is, every event E i is an event E α with α M β or such that α / M β and α is a tributary of a basin α M β, in which case α is also a tributary of β and belongs to R β We use t i to denote the current predicted time of the event E i Consider the basin α such that E i = E α Then we define t i := t i if α is a tributary of β and t i := if α M β and α is inactive; if α M β and α is active, we define t i to be its priority in Q We use induction on i to prove that t i = t i when event E i is processed To do so, we prove that the following holds after processing event E i: (i) Events E 1, E 2,, E i have been processed No other events have been processed (ii) W α = W t i α, for all active basins α M β (iii) t i+1 = t i+1, while t j t j, for all j > i + 1 The base case is i = 0, setting t 0 = 0 Property (i) holds because no events have been processed yet Property (ii) holds because initially W α = Wα, 0 for every elementary basin in M β, which is exactly the set of active basins at the beginning of the procedure To see that (iii) holds, observe that t j = t j if E j = E τ, for some tributary τ of β If E j = E α, for α M β, and α is inactive, then t j = ; if α is active, then t j = V α/wα 0 t j Moreover, if E 1 = E α, for some α M β, then α has no tributaries and t 1 = V α/wα 0 = t 1 For the inductive step, consider i > 0 and assume the claim holds for all j < i By part (i) of the inductive hypothesis, E i 1 is the (i 1)st event to be processed By part (iii) of the inductive hypothesis, we have t i = t i and t j t j > t i, for all j > i, after processing E i 1 Hence, E i is the ith event to be processed This establishes (i) To prove (ii), we consider the two possible types of event E i separately Let E i = E α If E i is a watershed event, part (ii) of the inductive hypothesis implies that we identify the correct watershed W α into which α spills at time t i and, hence, that we update W α to W t i; all other watersheds α remain unchanged If E i is a basin event, α and its sibling are full at time t i, while α s parent γ has to fill Furthermore, all water flowing into γ immediately before time t i collects in α because α s sibling is already full Therefore, W t i γ = W t i α, and we correctly set W γ := W α Finally, consider (iii) We consider only events E j = E α with α an active basin in M β because, by definition, t j = t j, for every spill event E j of a tributary of β, and t j =, for every spill event E j of an inactive basin in M β It is easily verified that we update t α correctly whenever we increase W α By (ii), each event E h with h i updates W α if and only if E h = E τ, for some tributary τ of α Thus, t j t j, for all j i For j = i + 1, the spill events of the tributaries of α are a subset of E 1, E 2,, E i, as they have to happen before E α = E i+1 Thus, we have t i+1 = t i+1 after E i is processed 33 Basin Sets With Linear Merge Trees Now consider a path P = β = α 0, α 1,, α k from a merge tree node β to a descendant leaf α k, and let be the sibling of α i, for all 1 i k The basic step of our solution for the case M β > M computes the spill times of α 1, β 1, α 2, β 2,, α k, β k, as well as the lists of tributaries R β1, R β2,, R βk of the basins β 1, β 2,, β k Our algorithm for this problem operates on the flow tree F := F(S) of the set S of basins in P Note that every edge in F corresponds to the spill point connecting two basins α i and, for 1 i k, and that λ f (α i) = λ r() and λ f () = λ r(α i) in this case So the tree is easy to construct using a constant number of sorting and scanning steps of S Details appear in the full paper We call a spill event E βi confluent, as the water spilling from at time t βi spills towards α k Similarly, we call a spill event E αi diffluent, as the water spilling from α i at time t αi spills away from α k (α k is a sub-basin of α i) We define F βi to be the subtree of F rooted in, assuming α k is chosen as the root of F Our algorithm proceeds in two phases The confluent phase computes times t β 1, t β 2,, t β k These times satisfy t = t βi if E βi is a watershed event and t t βi if E βi is a basin event In addition, this phase constructs a list L βi, for each 1 i k, which is a superset of those tributaries of that belong to F βi or spill into β across a saddle on the boundary of a basin β j F βi The second, diffluent phase computes times t α 1, t α 2,, t α k and t β 1, t β 2,, t β k from the times and potential tributary lists computed in the confluent phase, and we prove that t α i = t αi and t = t βi, for all 1 i k In the process, the diffluent phase computes the list R βi of tributaries of, for every 1 i k Intuitively, this two-phase approach works because the confluent phase ensures that the spill times of all basins that are tributaries of other basins are computed correctly (since can be a tributary only if its spill event is a watershed event) A basin α i can have only confluent tributaries, since basins α 1, α 2,, α k are nested A basin can have only one diffluent tributary, namely α i, because is a subbasin of α j, for all j < i, and α i is a super-basin of α h, for all h > i Thus, by processing the basins outwards from α k that is, by decreasing index i in the diffluent phase, we can ensure that the spill times of all tributaries of a basin are known by the time the basin is processed 331 From Tributaries to Spill Times Both phases of our algorithm use the same elementary procedure to compute a predicted spill time for a basin To avoid duplication, we describe this procedure here and refer to it as procedure FindSpillTime later The input of this procedure is a basin α, a time u α, the residual volume Vα r of α at time u α, and the watershed W α = Wα uα of α at time u α In addition, we are given a priority queue Q containing a set of potential tributaries of α; each entry τ Q satisfies t τ > u α, where t τ denotes the priority of τ The output of the procedure is a predicted spill time t α of α and a list L of entries removed from Q while computing t α Initially, we set t α := u α + Vα/W r α Then we repeat the following steps to update t α until either Q is empty or the minimum entry τ in Q satisfies t τ t α We remove the minimum entry τ (which satisfies t τ < t α) from Q and append it to L Then we set Vα r := r W α(t τ u α), W α := W α+w τ, u α := t τ, and t α := u α + Vα/W r α Lemma 2 Assume at the beginning of procedure Find- SpillTime, W α and Vα r reflect the actual watershed and residual volume of α at time u α, Q contains only entries τ with t τ u α, and t τ < t α if and only if τ is a tributary of α Assume further that every tributary τ Q of α satisfies t τ = t τ Then t α t α when the procedure terminates If Q

7 contains every tributary τ of α with t τ u α, then t α = t α, and L = {τ R α t τ u α} Proof First assume for the sake of contradiction that t α < t α when the algorithm terminates Then the last element τ Q processed by procedure FindSpillTime satisfies t τ < t α, as the update of t α when processing an entry τ does not decrease t α below t τ Hence, every processed entry τ satisfies t τ < t α This implies that all processed elements are tributaries of α that satisfy u α < t τ = t τ < t α, and we process them in order Thus, after processing every tributary τ, we have W α Wα tτ, which implies that t α t α, a contradiction Now assume Q contains all tributaries of α and t α > t α when procedure FindSpillTime finishes This is possible only if we do not process all tributaries of α An unprocessed tributary τ would have to remain in Q by the end of the procedure and, thus, satisfies t τ > t α However, t τ = t τ < t α < t α, again a contradiction Finally, observe that, if t α = t α, we process exactly the elements τ Q with t τ < t α, and place them into L These are exactly the tributaries of α that satisfy t τ u α 332 Confluent Phase To implement the confluent phase of the algorithm, we root the flow tree F in α k and process its nodes in postorder With every node α F we associate a list S α R β containing all tributaries τ of β that spill into β across a saddle on the boundary of Wα 0 These lists are easy to compute using O(sort(X)) I/Os from the preorder numbers of the elementary basins associated with the tributaries, assuming that every basin in M stores the largest preorder interval of all its elementary sub-basins, which is easily computed in a preprocessing step using a bottom-up traversal in M Details appear in the full paper During the traversal of F, we ensure that visiting a node produces a priority queue Q βi that contains all basins τ that satisfy (i) τ {β j} S βj, for some β j F βi, and (ii) t τ t β h, for all β h on the path from β j to in F, inclusive At any point during the postorder traversal, there is a set of active nodes, which are nodes that have been visited already but whose parents in F have not We maintain the priority queues of all active nodes in a sequence sorted in the order these nodes are visited and represent this sequence of priority queues using a data structure that supports Insert and DeleteMin operations on the last priority queue in the sequence, the creation of a new priority queue at the end of the sequence, as well as a Meld operation, which replaces the last two priority queues in the sequence with their union In Section 4, we show that the external heap of Fadel et al [12] can be extended to process a sequence of N Insert, DeleteMin, Create, and Meld operations using O(sort(N)) I/Os The processing of a node distinguishes two cases If is a leaf, we create a new priority queue Q βi at the end of the sequence of priority queues of active nodes If is an internal node with children β j1, β j2,, β jh, the corresponding priority queues Q βj1, Q βj2,, Q βjh are at the end of the current sequence of priority queues, and we construct Q βi by melding these priority queues In both cases, we continue by inserting all elements τ S βi into Q βi, with priority t τ = t τ Then we use procedure FindSpillTime to compute t and L βi ; the input to the procedure is, Q βi, u βi := 0, W βi := Wβ 0 i, and Vβ r i := V βi Once this procedure terminates, we insert into Q βi, with priority t The confluent phase terminates when the traversal reaches the root α k of F Since E αk is a diffluent spill event, we do not compute its time in this phase of the algorithm We only construct a list L αk by collecting all elements in Q βj1, Q βj2,, Q βjh and S αk, where β j1, β j2,, β jh are the children of α k in F To establish the correctness of the confluent phase, we require a number of technical lemmas The first one characterizes the spill events of basins on the flow path between a basin and its tributary For three basins β j, β h F and a basin α {α i, }, we say β h is on the path from β j to α in F if β h is not a sub-basin of α but belongs to the path from β j to every sub-basin α F of α Lemma 3 Consider a tributary τ of a basin α {α i, }, and assume that τ {β j} S βj, for some j Then E βh is a watershed event, for every basin β h on the path from β j to α in F that is not a sub-basin of α i Proof For j > i, the lemma holds vacuously because β j is a sub-basin of α i in this case and α {α i, }; this implies that every basin β h on the path from β j to α is a sub-basin of α i For j i, assume there exists a basin β h on the path from β j to α that is not a sub-basin of α i and such that E βh is a basin event Since β h is not a sub-basin of α i, we have h i If α = α i, this implies that α is a sub-basin of α h and t α t αh t βh If α =, then h < i because β h α Thus, α is again a sub-basin of α h ; in particular, t α t βh However, since β j is a tributary of α, we have t βj < t α and, therefore, t βj < t βh in both cases For β j to be a tributary of α, there has to exist a down-hill path from β j to α at time t βj By the choice of β h, any path from β j to α has to pass through a point interior to β h first and then through p βh Since β h is not full at time t βj, no such path is down-hill at time t βj, a contradiction Our next lemma shows that, if a basin τ {β j} R βj is a tributary of a basin with β j F βi, then τ L βi, that is, τ is processed when computing t ; otherwise, if it is a tributary of α {α i, }, then it is in L γ, for some sub-basin γ of α i The first part is used to prove in Lemma 5 below that the confluent phase computes the correct spill times for all confluent watershed events The second part is used to establish the correctness of the diffluent phase, discussed in Section 333 Lemma 4 Consider a basin τ {β j} S βj with τ L γ Assume further that every basin β h in F γ satisfies t β h t βh, with equality if E βh is a watershed event If τ is a tributary of and β j F βi, then γ = Otherwise, if τ is a tributary of a basin α {α i, }, then γ is a sub-basin of α i Proof First assume τ is a tributary of and β j F βi Then E τ is a watershed event and either τ S βj or τ = β j In both cases, we have t τ = t τ, in the latter case by the assumption that t β h = t βh for every watershed event on the path from β j to γ If γ belongs to the path from β j to and γ, then we have t γ = t γ < t τ because τ is a tributary of and, hence, E γ is a watershed event However, γ α k in this case, and every element τ in L γ is removed from Q γ while computing t γ Thus, t τ < t γ, a contradiction This shows that γ is an ancestor of in F This, however, implies that t t βi > t τ = t τ Thus, the computation of t removes τ from Q βi, and τ L βi

8 Now assume that τ is a tributary of a basin α {α i, } and, if α =, that β j F βi Then, if γ belongs to the path from β j to α i, we obtain t γ = t γ < t τ = t τ on the one hand, because τ is a tributary of α and, by Lemma 3, E γ is a watershed event; on the other hand, t γ > t τ because otherwise τ / L γ Thus, we obtain a contradiction and γ is a sub-basin of α i Lemma 5 For all 1 i k, we have t t βi If E βi is a watershed event, equality holds Proof By induction on F βi The base case is F βi = 0, in which case there is nothing to prove So consider a node and assume the claim holds for all its descendants Then, by Lemma 4, every tributary τ {β j} S βj, for some descendant β j of, is inspected when computing t, and each such tributary satisfies t τ = t τ by the inductive hypothesis In particular, each such tributary belongs to Q βi when invoking procedure FindSpillTime to compute t We prove that any other element α Q βi satisfies t α t βi By Lemma 2, this implies that t t βi, with t = t βi if Q βi contains all tributaries of, which is the case if E βi is a watershed event Indeed if there was a tributary τ such that τ {β j} R βj, for some β j / F βi, the water spilling from τ into would have to flow through a point interior to α i and then through p αi = p βi Since t τ < t βi and E βi is a watershed event, α i is not full at time t τ, that is, the path from τ to cannot be down-hill at time t τ, a contradiction So consider an element α Q βi, and assume that α {β j} S βj, for some β j F βi, and t α < t βi We prove that α is a tributary of We have t α t β h, for all β h on the path from β j to, excluding, because otherwise β h would have removed α from Q βh when computing t β h Since, by the inductive hypothesis, t βh t β h, this implies that t βh < t βi Also by the inductive hypothesis, we have t α t α If t α t βh, for all β h on the path from β j to, then α is a tributary of If t α < t βh t α, for some β h, then, by the inductive hypothesis, E α is not a watershed event In particular, α = β j, for some j < i, and t βi < t αj t βj = t α because is a sub-basin of α j This contradicts our assumption that t α t α < t βi 333 Diffluent Phase In the diffluent phase, we process the basins in M n the order α k, β k, α k 1, β k 1,, α 1, β 1 We initialize a priority queue Q to contain all events in L αk Then we repeat the following steps for i = k, k 1,, 1: First we compute t α i If i = k, we do this by invoking procedure FindSpillTime with arguments α k, Q, u αk := 0, W αk := Wα 0 k, and Vα r k := V αk If i < k, we invoke the procedure with arguments α i, Q, u αi := max(t α i+1, t +1 ), W αi := max(w αi+1, W βi+1 ), and Vα r i := V αi V αi+1 V βi+1 We place the elements of Q processed by procedure Find- SpillTime into a list R α i Then we compute t To do so, we insert α i (with priority t α i ) and the events in L βi into Q and invoke procedure FindSpillTime with arguments, Q, u βi := 0, W βi := Wβ 0 i, and Vβ r i := V βi We place the elements of Q processed by procedure FindSpillTime into a list R The computation of t α i and t may not process all elements in Q, and some of these elements may be sub-basins of α i These elements should be ignored in subsequent computations, as they cannot be tributaries of any α j or β j with j < i To ensure this, we augment the above procedure to ignore in iteration i all elements α j or β j in Q with j > i (Note that we ignore only these basins, not the elements of S αk or S βj with j > i) Lemma 6 For all 1 i k, we have t αi = t α i, t βi = t R βi = R, and R αi = R α k S k 1 j=i R β j Proof By induction on i For i = k, Lemmas 4 and 5 imply that every tributary τ of α k belongs to L αk and satisfies t τ = t τ Any other basin α L αk satisfies t α t α, and the same arguments as in the proof of Lemma 5 show that t α t αk in this case Hence, by Lemma 2, the first iteration computes t α k = t αk and processes only tributaries of α k, that is, R α k = R αk Now consider β k By Lemmas 4 and 5, every tributary of β k belongs to {α k } L αk L βk, and we have just argued that the computation of α k processes only tributaries of α k Hence, Q contains all tributaries of β k We have just argued that t α k = t αk and, by Lemma 5, any other tributary τ of β k satisfies t τ = t τ For an element α Q that is not a tributary of β k, the same arguments as in the proof of Lemma 5 show that t α t βk (However, we have to distinguish the cases β j F βk and β j / F βk, where α {β j} R βj ) Thus, by Lemma 2, we compute t β k = t βk and R β k = R βk Now assume that i < k and that the inductive hypothesis holds for all j > i By Lemmas 4 and 5, every tributary τ of α i is contained in L := L αk S k j=i+1 L β j = Q S k j=i+1 (R α j R β j ) and satisfies t τ = t τ Using the arguments from the proof of Lemma 5 again, every α L that is not a tributary of α i satisfies t α t αi Moreover, the elements of L \ Q are exactly the tributaries of α i+1 and +1, that is, the tributaries τ of α i that satisfy t τ < u αi Hence, by Lemma 2, we compute t α i = t αi and R α i to contain exactly the tributaries τ of α i that satisfy t τ u αi Thus, R α = R α i S k j=i+1 (R α j R β j ) The correctness is identical to that proof for the computation of t and R for β k Lemma 7 The computation of the spill times t αi and t βi and of the tributary lists R βi, for all 1 i k, takes O(sort(X)) I/Os Proof The correctness of the algorithm is established by Lemma 6 Its I/O-complexity is established as follows: Both phases of the algorithm perform O(X) priority queue operations, as every event is inserted and removed from a priority queue only once in each of the two phases and the confluent phase performs at most X Meld operations By Theorem 3 in Section 4, these priority queue operations cost O(sort(X)) I/Os Apart from this, the algorithm traverses trees F and M(S) once each and scans lists associated with the tree nodes of total length O(X) After arranging the tree nodes and list elements in the right order, using a sorting step, the scanning of these lists takes O(X/B) I/Os 34 Computing the Flooding Times of All Terrain Vertices The terrain vertices that are not spill points of basins can be seen as partitioning the basins of T into sub-basins as follows For each terrain vertex v, let α v be the smallest basin that contains v For a basin β, let V (β) be the set of terrain vertices with α v = β Each vertex v V (β) defines a sub-basin β T (v) of β containing the portion of β below elevation T (v) Let v 1, v 2,, v k be the vertices in V (β) sorted by,

9 increasing elevation Then the water from every tributary in R first collects in β T (v1 ); once β T (v1 ) is full, the water collects in β T (v2 ), and so on This is just a simplified version of the diffluent flow computation in Section 333, and computing the spill times of basins β T (v1 ), β T (v2 ),, β T (vk ) that is, the flooding times of vertices v 1, v 2,, v k from the list R β takes O(sort(N)) I/Os in total for all basins of T The computation of the sets V (β) can be incorporated in the computation of the basins of T without increasing the I/O complexity of that phase Details appear in the full paper Thus, we have the following result Theorem 2 The flooding times of all vertices of a terrain T with N vertices and X pits can be computed using O(sort(X) log(x/m) + sort(n)) I/Os, given the watersheds and volumes of all basins of T 4 A MELDABLE PRIORITY QUEUE In this section, we discuss how to extend the external heap of Fadel et al [12] to obtain a meldable priority queue that can be used in the procedure in Section 332 This structure maintains a sequence of priority queues Q 1, Q 2,, Q k under Create, Insert, DeleteMin, and Meld operations Given the current sequence Q 1, Q 2,, Q k, a Create operation creates a new, empty priority queue Q k+1 and appends it to the end of the sequence; an Insert(x) operation inserts element x into Q k ; a DeleteMin operation deletes and returns the minimum element in Q k ; a Meld operation replaces Q k 1 and Q k with a new priority queue Q k 1 := Q k 1 Q k containing the elements from Q k 1 and Q k We prove the following result Theorem 3 There exists a linear-space data structure that uses O(sort(N)) I/Os to process a sequence of N Create, Insert, DeleteMin, and Meld operations 41 The Structure Each priority queue Q i is represented as an external heap similar to the one presented in [12] The underlying structure is a rooted tree whose internal nodes have between a = M/(4B) and b = M/B children There are two types of leaves: regular leaves are on the lowest level of the tree, which we call the leaf level; leaves above the leaf level are special Every node v has an associated buffer X v If v is an internal node, X v has size M If v is a leaf, there is no bound on the size of X v The elements in X v are sorted, and the buffer contents of adjacent nodes satisfy the heap property: for every node v with parent u and every pair of elements x X u and y X v, we have x y To support Insert and DeleteMin operations efficiently, every priority queue Q i is equipped with an insert/delete buffer or I/D buffer for short This buffer is capable of holding up to M elements and stores the minimum elements in Q i as well as newly inserted elements To distinguish newly inserted elements in Q i s I/D buffer from minimum elements, Q i has an associated priority p i, which is the minimum priority of the elements stored in the external part of Q i The I/D buffers of priority queues Q 1, Q 2,, Q k are kept on a buffer stack, with the buffer of Q 1 at the bottom and the buffer of Q k at the top The I/D buffers of consecutive priority queues are separated by special marker elements We always keep the topmost M elements of this stack in memory, which ensures in particular that the I/D buffer of Q k is in memory 42 Operations Create To create a new priority queue Q k+1, a Create operation pushes a new marker element onto the buffer stack Insert An Insert operation on Q k adds the inserted element x to Q k s I/D buffer If the buffer now contains M elements, we Flush the buffer as described below Then we Fill the I/D buffer with the M/2 minimum elements in Q k and update p k accordingly DeleteMin A DeleteMin operation on Q k returns the minimum element in Q k s I/D buffer Before doing so, however, it checks whether the I/D buffer of Q k is empty or its minimum element is greater than p k If so, it Flushes the I/D buffer and then Fills it with the M/2 minimum elements in Q k Meld A Meld operation behaves differently depending on the structure of the two involved priority queues, Q k 1 and Q k During different stages of its life time, a priority queue may have no external portion because all its elements fit in the I/D buffer If at least one of the two priority queues, say Q k, has no external portion, we destroy its I/D buffer, and Insert its elements into Q k 1 If both priority queues have external portions, we Flush and destroy their I/D buffers, Merge their two trees as described below, and then Fill the I/D buffer of the merged priority queue Q k 1 := Q k 1 Q k with the M/2 minimum elements in Q k 1 Flush A Flush operation of an I/D buffer containing K elements creates a new leaf l at the leaf level and stores these K elements in l Then it applies a Heapify operation to the path from l to the root to restore the heap property If the addition of l increased the degree of l s parent p to M/B + 1, we split p into two nodes p and p, each with half of p s children We associate the buffer of p with p and populate the buffer of p with the M smallest elements in its subtree using a Fill operation If this split increases the degree of p s parent to M/B + 1, we apply this rebalancing procedure recursively until we reach the root If the root has degree greater than M/B, we split it into two nodes with a new parent Fill A Fill operation applied to a node v repeatedly takes the smallest element stored in the buffers of v s children, removes it from the corresponding child s buffer and stores it in v s buffer This continues until M elements have been collected in v s buffer or there are no elements left in the buffers of v s children Whenever a child buffer runs empty while filling v s buffer, its buffer is filled recursively before continuing to fill v s buffer Once there are no more elements left in a node v s subtree, we mark v as exhausted, in order to avoid repeatedly trying to fill v with elements More precisely, a node is marked as exhausted once its buffer becomes empty and all its children are exhausted To fill the I/D buffer of a priority queue Q k with the M/2 smallest elements in Q k, we transfer the M/2 smallest elements from the root of Q k into the I/D buffer, Filling the root s buffer whenever it runs empty If the root becomes exhausted in the process, we delete the entire external portion of the priority queue

### Dictionary: an abstract data type

2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees

### Shortest paths with negative lengths

Chapter 8 Shortest paths with negative lengths In this chapter we give a linear-space, nearly linear-time algorithm that, given a directed planar graph G with real positive and negative lengths, but no

### Flow on terrains (II)

Algorithms for GIS Flow on terrains (II) Laura Toma Bowdoin College Overview Flow on grid terrains Dealing with flat areas From flow accumulation to river network and watersheds Pfafstetter watershed and

### CS361 Homework #3 Solutions

CS6 Homework # Solutions. Suppose I have a hash table with 5 locations. I would like to know how many items I can store in it before it becomes fairly likely that I have a collision, i.e., that two items

### Dictionary: an abstract data type

2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees

### Cache-Oblivious Range Reporting With Optimal Queries Requires Superlinear Space

Cache-Oblivious Range Reporting With Optimal Queries Requires Superlinear Space Peyman Afshani MADALGO Dept. of Computer Science University of Aarhus IT-Parken, Aabogade 34 DK-8200 Aarhus N Denmark peyman@madalgo.au.dk

### Advanced Implementations of Tables: Balanced Search Trees and Hashing

Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the

### CS60007 Algorithm Design and Analysis 2018 Assignment 1

CS60007 Algorithm Design and Analysis 2018 Assignment 1 Palash Dey and Swagato Sanyal Indian Institute of Technology, Kharagpur Please submit the solutions of the problems 6, 11, 12 and 13 (written in

### The Complexity of Constructing Evolutionary Trees Using Experiments

The Complexity of Constructing Evolutionary Trees Using Experiments Gerth Stlting Brodal 1,, Rolf Fagerberg 1,, Christian N. S. Pedersen 1,, and Anna Östlin2, 1 BRICS, Department of Computer Science, University

### Divide-and-Conquer Algorithms Part Two

Divide-and-Conquer Algorithms Part Two Recap from Last Time Divide-and-Conquer Algorithms A divide-and-conquer algorithm is one that works as follows: (Divide) Split the input apart into multiple smaller

### ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October

ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October Contents 1 Cartesian Tree. Definition. 1 2 Cartesian Tree. Construction 1 3 Cartesian Tree. Operations. 2 3.1 Split............................................

### On improving matchings in trees, via bounded-length augmentations 1

On improving matchings in trees, via bounded-length augmentations 1 Julien Bensmail a, Valentin Garnero a, Nicolas Nisse a a Université Côte d Azur, CNRS, Inria, I3S, France Abstract Due to a classical

### Dominating Set Counting in Graph Classes

Dominating Set Counting in Graph Classes Shuji Kijima 1, Yoshio Okamoto 2, and Takeaki Uno 3 1 Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan kijima@inf.kyushu-u.ac.jp

### CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017

CMSC CMSC : Lecture Greedy Algorithms for Scheduling Tuesday, Sep 9, 0 Reading: Sects.. and. of KT. (Not covered in DPV.) Interval Scheduling: We continue our discussion of greedy algorithms with a number

### Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn Priority Queue / Heap Stores (key,data) pairs (like dictionary) But, different set of operations: Initialize-Heap: creates new empty heap

### RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION

RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION MAGNUS BORDEWICH, KATHARINA T. HUBER, VINCENT MOULTON, AND CHARLES SEMPLE Abstract. Phylogenetic networks are a type of leaf-labelled,

### Graph Theorizing Peg Solitaire. D. Paul Hoilman East Tennessee State University

Graph Theorizing Peg Solitaire D. Paul Hoilman East Tennessee State University December 7, 00 Contents INTRODUCTION SIMPLE SOLVING CONCEPTS 5 IMPROVED SOLVING 7 4 RELATED GAMES 5 5 PROGENATION OF SOLVABLE

### Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs

Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs Krishnendu Chatterjee Rasmus Ibsen-Jensen Andreas Pavlogiannis IST Austria Abstract. We consider graphs with n nodes together

### IS 709/809: Computational Methods in IS Research Fall Exam Review

IS 709/809: Computational Methods in IS Research Fall 2017 Exam Review Nirmalya Roy Department of Information Systems University of Maryland Baltimore County www.umbc.edu Exam When: Tuesday (11/28) 7:10pm

### HW Graph Theory SOLUTIONS (hbovik) - Q

1, Diestel 3.5: Deduce the k = 2 case of Menger s theorem (3.3.1) from Proposition 3.1.1. Let G be 2-connected, and let A and B be 2-sets. We handle some special cases (thus later in the induction if these

### CSE 202 Homework 4 Matthias Springer, A

CSE 202 Homework 4 Matthias Springer, A99500782 1 Problem 2 Basic Idea PERFECT ASSEMBLY N P: a permutation P of s i S is a certificate that can be checked in polynomial time by ensuring that P = S, and

### On the Complexity of Partitioning Graphs for Arc-Flags

Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 17, no. 3, pp. 65 99 (013) DOI: 10.7155/jgaa.0094 On the Complexity of Partitioning Graphs for Arc-Flags Reinhard Bauer Moritz Baum Ignaz

### Assignment 5: Solutions

Comp 21: Algorithms and Data Structures Assignment : Solutions 1. Heaps. (a) First we remove the minimum key 1 (which we know is located at the root of the heap). We then replace it by the key in the position

### Advanced Analysis of Algorithms - Midterm (Solutions)

Advanced Analysis of Algorithms - Midterm (Solutions) K. Subramani LCSEE, West Virginia University, Morgantown, WV {ksmani@csee.wvu.edu} 1 Problems 1. Solve the following recurrence using substitution:

### Cographs; chordal graphs and tree decompositions

Cographs; chordal graphs and tree decompositions Zdeněk Dvořák September 14, 2015 Let us now proceed with some more interesting graph classes closed on induced subgraphs. 1 Cographs The class of cographs

### University of Toronto Department of Electrical and Computer Engineering. Final Examination. ECE 345 Algorithms and Data Structures Fall 2016

University of Toronto Department of Electrical and Computer Engineering Final Examination ECE 345 Algorithms and Data Structures Fall 2016 Print your first name, last name, UTORid, and student number neatly

### Assignment 4. CSci 3110: Introduction to Algorithms. Sample Solutions

Assignment 4 CSci 3110: Introduction to Algorithms Sample Solutions Question 1 Denote the points by p 1, p 2,..., p n, ordered by increasing x-coordinates. We start with a few observations about the structure

### Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems

Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems Rani M. R, Mohith Jagalmohanan, R. Subashini Binary matrices having simultaneous consecutive

### Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Search Trees Chapter 1 < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 3 - Binary Search Trees Ø A binary search

### On Two Class-Constrained Versions of the Multiple Knapsack Problem

On Two Class-Constrained Versions of the Multiple Knapsack Problem Hadas Shachnai Tami Tamir Department of Computer Science The Technion, Haifa 32000, Israel Abstract We study two variants of the classic

### FIBONACCI HEAPS. preliminaries insert extract the minimum decrease key bounding the rank meld and delete. Copyright 2013 Kevin Wayne

FIBONACCI HEAPS preliminaries insert extract the minimum decrease key bounding the rank meld and delete Copyright 2013 Kevin Wayne http://www.cs.princeton.edu/~wayne/kleinberg-tardos Last updated on Sep

### Maximising the number of induced cycles in a graph

Maximising the number of induced cycles in a graph Natasha Morrison Alex Scott April 12, 2017 Abstract We determine the maximum number of induced cycles that can be contained in a graph on n n 0 vertices,

### Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018

CS17 Integrated Introduction to Computer Science Klein Contents Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018 1 Tree definitions 1 2 Analysis of mergesort using a binary tree 1 3 Analysis of

### FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016)

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016) The final exam will be on Thursday, May 12, from 8:00 10:00 am, at our regular class location (CSI 2117). It will be closed-book and closed-notes, except

### Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Search Trees < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Learning Outcomes Ø From this lecture, you should be able to: q Define the properties of a binary search

### Faster Algorithms for some Optimization Problems on Collinear Points

Faster Algorithms for some Optimization Problems on Collinear Points Ahmad Biniaz Prosenjit Bose Paz Carmi Anil Maheshwari J. Ian Munro Michiel Smid June 29, 2018 Abstract We propose faster algorithms

### Algorithms for pattern involvement in permutations

Algorithms for pattern involvement in permutations M. H. Albert Department of Computer Science R. E. L. Aldred Department of Mathematics and Statistics M. D. Atkinson Department of Computer Science D.

### Priority queues implemented via heaps

Priority queues implemented via heaps Comp Sci 1575 Data s Outline 1 2 3 Outline 1 2 3 Priority queue: most important first Recall: queue is FIFO A normal queue data structure will not implement a priority

### 1 Basic Definitions. 2 Proof By Contradiction. 3 Exchange Argument

1 Basic Definitions A Problem is a relation from input to acceptable output. For example, INPUT: A list of integers x 1,..., x n OUTPUT: One of the three smallest numbers in the list An algorithm A solves

### Fly Cheaply: On the Minimum Fuel Consumption Problem

Journal of Algorithms 41, 330 337 (2001) doi:10.1006/jagm.2001.1189, available online at http://www.idealibrary.com on Fly Cheaply: On the Minimum Fuel Consumption Problem Timothy M. Chan Department of

### Efficient Reassembling of Graphs, Part 1: The Linear Case

Efficient Reassembling of Graphs, Part 1: The Linear Case Assaf Kfoury Boston University Saber Mirzaei Boston University Abstract The reassembling of a simple connected graph G = (V, E) is an abstraction

### CS 161: Design and Analysis of Algorithms

CS 161: Design and Analysis of Algorithms Greedy Algorithms 3: Minimum Spanning Trees/Scheduling Disjoint Sets, continued Analysis of Kruskal s Algorithm Interval Scheduling Disjoint Sets, Continued Each

### arxiv: v1 [cs.dm] 26 Apr 2010

A Simple Polynomial Algorithm for the Longest Path Problem on Cocomparability Graphs George B. Mertzios Derek G. Corneil arxiv:1004.4560v1 [cs.dm] 26 Apr 2010 Abstract Given a graph G, the longest path

### 1 Sequences and Summation

1 Sequences and Summation A sequence is a function whose domain is either all the integers between two given integers or all the integers greater than or equal to a given integer. For example, a m, a m+1,...,

### CSE 591 Homework 3 Sample Solutions. Problem 1

CSE 591 Homework 3 Sample Solutions Problem 1 (a) Let p = max{1, k d}, q = min{n, k + d}, it suffices to find the pth and qth largest element of L and output all elements in the range between these two

### Lecture 1 : Data Compression and Entropy

CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for

### Total Ordering on Subgroups and Cosets

Total Ordering on Subgroups and Cosets Alexander Hulpke Department of Mathematics Colorado State University 1874 Campus Delivery Fort Collins, CO 80523-1874 hulpke@math.colostate.edu Steve Linton Centre

### Automorphism groups of wreath product digraphs

Automorphism groups of wreath product digraphs Edward Dobson Department of Mathematics and Statistics Mississippi State University PO Drawer MA Mississippi State, MS 39762 USA dobson@math.msstate.edu Joy

### Entropy-Preserving Cuttings and Space-Efficient Planar Point Location

Appeared in ODA 2001, pp. 256-261. Entropy-Preserving Cuttings and pace-efficient Planar Point Location unil Arya Theocharis Malamatos David M. Mount Abstract Point location is the problem of preprocessing

### CPSC 320 Sample Final Examination December 2013

CPSC 320 Sample Final Examination December 2013 [10] 1. Answer each of the following questions with true or false. Give a short justification for each of your answers. [5] a. 6 n O(5 n ) lim n + This is

### 1 Efficient Transformation to CNF Formulas

1 Efficient Transformation to CNF Formulas We discuss an algorithm, due to Tseitin [?], which efficiently transforms an arbitrary Boolean formula φ to a CNF formula ψ such that ψ has a model if and only

### Undirected Graphs. V = { 1, 2, 3, 4, 5, 6, 7, 8 } E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6 } n = 8 m = 11

Undirected Graphs Undirected graph. G = (V, E) V = nodes. E = edges between pairs of nodes. Captures pairwise relationship between objects. Graph size parameters: n = V, m = E. V = {, 2, 3,,,, 7, 8 } E

Chapter 4 Greedy Algorithms Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved. 1 4.1 Interval Scheduling Interval Scheduling Interval scheduling. Job j starts at s j and

### Induced Saturation Number

Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2012 Induced Saturation Number Jason James Smith Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/etd

### Santa Claus Schedules Jobs on Unrelated Machines

Santa Claus Schedules Jobs on Unrelated Machines Ola Svensson (osven@kth.se) Royal Institute of Technology - KTH Stockholm, Sweden March 22, 2011 arxiv:1011.1168v2 [cs.ds] 21 Mar 2011 Abstract One of the

### 8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm

8 Priority Queues 8 Priority Queues A Priority Queue S is a dynamic set data structure that supports the following operations: S. build(x 1,..., x n ): Creates a data-structure that contains just the elements

Chapter 11 Approximation Algorithms Slides by Kevin Wayne. Copyright @ 2005 Pearson-Addison Wesley. All rights reserved. 1 Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should

### A Decidable Class of Planar Linear Hybrid Systems

A Decidable Class of Planar Linear Hybrid Systems Pavithra Prabhakar, Vladimeros Vladimerou, Mahesh Viswanathan, and Geir E. Dullerud University of Illinois at Urbana-Champaign. Abstract. The paper shows

### Asymptotic redundancy and prolixity

Asymptotic redundancy and prolixity Yuval Dagan, Yuval Filmus, and Shay Moran April 6, 2017 Abstract Gallager (1978) considered the worst-case redundancy of Huffman codes as the maximum probability tends

### Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler

Complexity Theory Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 15 May, 2018 Reinhard

### Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181.

Complexity Theory Complexity Theory Outline Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität

### Algorithm Design and Analysis

Algorithm Design and Analysis LECTURE 6 Greedy Algorithms Interval Scheduling Interval Partitioning Scheduling to Minimize Lateness Sofya Raskhodnikova S. Raskhodnikova; based on slides by E. Demaine,

### Analysis of Algorithms - Midterm (Solutions)

Analysis of Algorithms - Midterm (Solutions) K Subramani LCSEE, West Virginia University, Morgantown, WV {ksmani@cseewvuedu} 1 Problems 1 Recurrences: Solve the following recurrences exactly or asymototically

### Disjoint-Set Forests

Disjoint-Set Forests Thanks for Showing Up! Outline for Today Incremental Connectivity Maintaining connectivity as edges are added to a graph. Disjoint-Set Forests A simple data structure for incremental

### 1 Approximate Quantiles and Summaries

CS 598CSC: Algorithms for Big Data Lecture date: Sept 25, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri Suppose we have a stream a 1, a 2,..., a n of objects from an ordered universe. For simplicity

### 1 Some loose ends from last time

Cornell University, Fall 2010 CS 6820: Algorithms Lecture notes: Kruskal s and Borůvka s MST algorithms September 20, 2010 1 Some loose ends from last time 1.1 A lemma concerning greedy algorithms and

### Heaps and Priority Queues

Heaps and Priority Queues Motivation Situations where one has to choose the next most important from a collection. Examples: patients in an emergency room, scheduling programs in a multi-tasking OS. Need

### Reverse mathematics and marriage problems with unique solutions

Reverse mathematics and marriage problems with unique solutions Jeffry L. Hirst and Noah A. Hughes January 28, 2014 Abstract We analyze the logical strength of theorems on marriage problems with unique

### Pattern Popularity in 132-Avoiding Permutations

Pattern Popularity in 132-Avoiding Permutations The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Rudolph,

### Analysis of tree edit distance algorithms

Analysis of tree edit distance algorithms Serge Dulucq 1 and Hélène Touzet 1 LaBRI - Universit Bordeaux I 33 0 Talence cedex, France Serge.Dulucq@labri.fr LIFL - Universit Lille 1 9 6 Villeneuve d Ascq

### Factorised Representations of Query Results: Size Bounds and Readability

Factorised Representations of Query Results: Size Bounds and Readability Dan Olteanu and Jakub Závodný Department of Computer Science University of Oxford {dan.olteanu,jakub.zavodny}@cs.ox.ac.uk ABSTRACT

### Popular Mechanics, 1954

Introduction to GIS Popular Mechanics, 1954 1986 \$2,599 1 MB of RAM 2017, \$750, 128 GB memory, 2 GB of RAM Computing power has increased exponentially over the past 30 years, Allowing the existence of

### On shredders and vertex connectivity augmentation

On shredders and vertex connectivity augmentation Gilad Liberman The Open University of Israel giladliberman@gmail.com Zeev Nutov The Open University of Israel nutov@openu.ac.il Abstract We consider the

### Theory of Computation

Theory of Computation (Feodor F. Dragan) Department of Computer Science Kent State University Spring, 2018 Theory of Computation, Feodor F. Dragan, Kent State University 1 Before we go into details, what

### Functional Data Structures

Functional Data Structures with Isabelle/HOL Tobias Nipkow Fakultät für Informatik Technische Universität München 2017-2-3 1 Part II Functional Data Structures 2 Chapter 1 Binary Trees 3 1 Binary Trees

### Recap & Interval Scheduling

Lecture 2 Recap & Interval Scheduling Supplemental reading in CLRS: Section 6.; Section 4.4 2. Recap of Median Finding Like MERGE-SORT, the median-of-medians algorithm SELECT calls itself recursively,

### Connectivity and tree structure in finite graphs arxiv: v5 [math.co] 1 Sep 2014

Connectivity and tree structure in finite graphs arxiv:1105.1611v5 [math.co] 1 Sep 2014 J. Carmesin R. Diestel F. Hundertmark M. Stein 20 March, 2013 Abstract Considering systems of separations in a graph

### CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 )

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26 Homework #2 ( Due: Nov 8 ) Task 1. [ 80 Points ] Average Case Analysis of Median-of-3 Quicksort Consider the median-of-3 quicksort algorithm

### Finding and Approximating Top-k Answers in Keyword Proximity Search

Finding and Approximating Top-k Answers in Keyword Proximity Search [Extended Abstract] Benny Kimelfeld and Yehoshua Sagiv The Selim and Rachel Benin School of Engineering and Computer Science, The Hebrew

### Preliminaries. Graphs. E : set of edges (arcs) (Undirected) Graph : (i, j) = (j, i) (edges) V = {1, 2, 3, 4, 5}, E = {(1, 3), (3, 2), (2, 4)}

Preliminaries Graphs G = (V, E), V : set of vertices E : set of edges (arcs) (Undirected) Graph : (i, j) = (j, i) (edges) 1 2 3 5 4 V = {1, 2, 3, 4, 5}, E = {(1, 3), (3, 2), (2, 4)} 1 Directed Graph (Digraph)

### RECOVERY OF NON-LINEAR CONDUCTIVITIES FOR CIRCULAR PLANAR GRAPHS

RECOVERY OF NON-LINEAR CONDUCTIVITIES FOR CIRCULAR PLANAR GRAPHS WILL JOHNSON Abstract. We consider the problem of recovering nonlinear conductances in a circular planar graph. If the graph is critical

### Topics in Approximation Algorithms Solution for Homework 3

Topics in Approximation Algorithms Solution for Homework 3 Problem 1 We show that any solution {U t } can be modified to satisfy U τ L τ as follows. Suppose U τ L τ, so there is a vertex v U τ but v L

### Blame and Coercion: Together Again for the First Time

Blame and Coercion: Together Again for the First Time Supplementary Material Jeremy Siek Indiana University, USA jsiek@indiana.edu Peter Thiemann Universität Freiburg, Germany thiemann@informatik.uni-freiburg.de

### A Tight Lower Bound for Top-Down Skew Heaps

A Tight Lower Bound for Top-Down Skew Heaps Berry Schoenmakers January, 1997 Abstract Previously, it was shown in a paper by Kaldewaij and Schoenmakers that for topdown skew heaps the amortized number

Complexity Tradeoffs for Read and Update Operations Danny Hendler Department of Computer-Science Ben-Gurion University hendlerd@cs.bgu.ac.il Vitaly Khait Department of Computer-Science Ben-Gurion University

### Advances in processor, memory, and communication technologies

Discrete and continuous min-energy schedules for variable voltage processors Minming Li, Andrew C. Yao, and Frances F. Yao Department of Computer Sciences and Technology and Center for Advanced Study,

### Contents Lecture 4. Greedy graph algorithms Dijkstra s algorithm Prim s algorithm Kruskal s algorithm Union-find data structure with path compression

Contents Lecture 4 Greedy graph algorithms Dijkstra s algorithm Prim s algorithm Kruskal s algorithm Union-find data structure with path compression Jonas Skeppstedt (jonasskeppstedt.net) Lecture 4 2018

### Nowhere 0 mod p dominating sets in multigraphs

Nowhere 0 mod p dominating sets in multigraphs Raphael Yuster Department of Mathematics University of Haifa at Oranim Tivon 36006, Israel. e-mail: raphy@research.haifa.ac.il Abstract Let G be a graph with

### Online Sorted Range Reporting and Approximating the Mode

Online Sorted Range Reporting and Approximating the Mode Mark Greve Progress Report Department of Computer Science Aarhus University Denmark January 4, 2010 Supervisor: Gerth Stølting Brodal Online Sorted

### Extracting a Largest Redundancy-Free XML Storage Structure from an Acyclic Hypergraph in Polynomial Time

Extracting a Largest Redundancy-Free XML Storage Structure from an Acyclic Hypergraph in Polynomial Time Wai Yin Mok, Joseph Fong and David W. Embley Abstract Given a hypergraph and a set of embedded functional

### Greedy Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 10

Greedy Algorithms CSE 101: Design and Analysis of Algorithms Lecture 10 CSE 101: Design and analysis of algorithms Greedy algorithms Reading: Kleinberg and Tardos, sections 4.1, 4.2, and 4.3 Homework 4

### Approximation Algorithms for Asymmetric TSP by Decomposing Directed Regular Multigraphs

Approximation Algorithms for Asymmetric TSP by Decomposing Directed Regular Multigraphs Haim Kaplan Tel-Aviv University, Israel haimk@post.tau.ac.il Nira Shafrir Tel-Aviv University, Israel shafrirn@post.tau.ac.il

### arxiv: v1 [math.co] 22 Jan 2013

NESTED RECURSIONS, SIMULTANEOUS PARAMETERS AND TREE SUPERPOSITIONS ABRAHAM ISGUR, VITALY KUZNETSOV, MUSTAZEE RAHMAN, AND STEPHEN TANNY arxiv:1301.5055v1 [math.co] 22 Jan 2013 Abstract. We apply a tree-based

### Data Structures and Algorithms " Search Trees!!

Data Structures and Algorithms " Search Trees!! Outline" Binary Search Trees! AVL Trees! (2,4) Trees! 2 Binary Search Trees! "! < 6 2 > 1 4 = 8 9 Ordered Dictionaries" Keys are assumed to come from a total

### DISTINGUISHING PARTITIONS AND ASYMMETRIC UNIFORM HYPERGRAPHS

DISTINGUISHING PARTITIONS AND ASYMMETRIC UNIFORM HYPERGRAPHS M. N. ELLINGHAM AND JUSTIN Z. SCHROEDER In memory of Mike Albertson. Abstract. A distinguishing partition for an action of a group Γ on a set

### REVIEW QUESTIONS. Chapter 1: Foundations: Sets, Logic, and Algorithms

REVIEW QUESTIONS Chapter 1: Foundations: Sets, Logic, and Algorithms 1. Why can t a Venn diagram be used to prove a statement about sets? 2. Suppose S is a set with n elements. Explain why the power set

### INF2220: algorithms and data structures Series 1

Universitetet i Oslo Institutt for Informatikk I. Yu, D. Karabeg INF2220: algorithms and data structures Series 1 Topic Function growth & estimation of running time, trees (Exercises with hints for solution)

### Covering Linear Orders with Posets

Covering Linear Orders with Posets Proceso L. Fernandez, Lenwood S. Heath, Naren Ramakrishnan, and John Paul C. Vergara Department of Information Systems and Computer Science, Ateneo de Manila University,