Comp 21: Algorithms and Data Structures Assignment: Solutions

1. Heaps.

(a) First we remove the minimum key 1 (which we know is located at the root of the heap). We then replace it by the key in position n of the heap; that is, we move the last key of the heap to the root. The heap property is now violated at the root, so we apply heapify-down from the root to re-establish the heap property. This process, showing how the key moves down the tree, is illustrated below. [Heap diagrams omitted.]
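For concreteness, the following is a minimal sketch of extract-min on a binary min-heap stored as a Python list, the standard array representation (an assumption, since the assignment does not fix one); the function name and layout are illustrative, not part of the original solution.

```python
def extract_min(h):
    """Remove and return the minimum key of a binary min-heap stored in list h."""
    if not h:
        raise IndexError("extract_min from an empty heap")
    h[0], h[-1] = h[-1], h[0]   # move the key in position n to the root
    minimum = h.pop()           # remove the old minimum
    i, n = 0, len(h)
    while True:                 # heapify-down from the root
        smallest = i
        for c in (2 * i + 1, 2 * i + 2):           # children of node i
            if c < n and h[c] < h[smallest]:
                smallest = c
        if smallest == i:
            break               # heap property re-established
        h[i], h[smallest] = h[smallest], h[i]      # swap with smaller child
        i = smallest
    return minimum
```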
(b) The set W of keys of value at most k can be found in O(|W|) time. The reason is that keys with low value are at the top of the heap and keys with high value are near the bottom; in particular, the keys along any path from the root to a leaf are nondecreasing. This means we can find W using depth-first search (or breadth-first search) on the heap, as sketched below. We start at the root. If the key of a node is at most k, then we report the node and recurse on its children. If the key is greater than k, then we need not recurse on the children (as their keys, and the keys of all their descendants, must also be larger than k). Now let's examine the running time. The DFS/BFS algorithm examines each node in W exactly once. In addition, it must examine the children of every node in W (either to find more nodes of W or to determine that their keys are larger than k, so that we need not search the corresponding subtrees). Since we have a binary heap, each node in W has at most two children, so the total number of nodes we examine is O(|W|).
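A sketch of this search, under the same assumed list representation (children of index i at 2i+1 and 2i+2); keys_at_most is an illustrative name:

```python
def keys_at_most(h, k, i=0):
    """Return all keys of value at most k in the binary min-heap h.

    Runs in O(|W|) time: we recurse only below nodes whose keys are <= k,
    and each such node has at most two children.
    """
    if i >= len(h) or h[i] > k:
        return []                       # all descendant keys are also > k
    return ([h[i]]
            + keys_at_most(h, k, 2 * i + 1)
            + keys_at_most(h, k, 2 * i + 2))
```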
2. Heaps.

(a) A d-ary heap with n vertices has depth Θ(log_d n), so heapify-up takes time O(log_d n).

(b) heapify-down takes time O(d log_d n) as, at each step, we need to compare the parent key to the keys of its d children.

(c) Recall that Dijkstra's algorithm may require n applications of extract-min and m applications of decrease-key. The former uses heapify-down and the latter uses heapify-up. Thus the total run time is O(n d log_d n + m log_d n).

(d) The quickest implementation occurs if we equate n d log_d n and m log_d n. Thus we want n d = m. Since m = n^{1+ε}, we set d = n^ε. Therefore n = d^{1/ε}, and then log_d n = 1/ε. The total run time is then O((1/ε) m).
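In the list representation of a d-ary heap, the children of index i sit at indices d·i + 1, ..., d·i + d. A hedged sketch of heapify-down showing where the O(d) work per level in part (b) comes from (the representation and names are assumptions):

```python
def heapify_down(h, i, d):
    """Restore the min-heap property in a d-ary heap h, starting at index i.

    Each level costs O(d) comparisons and the heap has O(log_d n) levels,
    matching the O(d log_d n) bound of part (b).
    """
    n = len(h)
    while True:
        first = d * i + 1                          # i's first child
        smallest = i
        for c in range(first, min(first + d, n)):  # scan all d children
            if h[c] < h[smallest]:
                smallest = c
        if smallest == i:
            return
        h[i], h[smallest] = h[smallest], h[i]
        i = smallest
```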
3. Hash Tables.

(a) We compute the hash function h(k) = (k + 2) mod m for each key k and insert it into the corresponding slot in the hash table:

h() = ( + 2) mod m = 2
h() = ( + 2) mod m = 2
h(2) = (2 + 2) mod m = 1
h() = ( + 2) mod m = 1
h() = ( + 2) mod m =
h(1) = (1 + 2) mod m =
h(20) = (20 + 2) mod m = 6

Each slot in the hash table maintains a list (e.g. a linked list) of all the keys that hash to it. [Figure of the resulting table omitted.]

(b) Hash tables are not very useful if we want to keep track of the minimum key. In particular, the running time to find a minimum key is O(m + n) if the hash table has m slots and n keys. This is because we have to examine all m slots (even those that are empty), and we also have to go through the linked list in every slot and thus examine all n keys.
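The O(m + n) bound in part (b) is easy to see in code. Below is a minimal sketch of a chained hash table with the solution's hash function h(k) = (k + 2) mod m; since the table size used in part (a) was not preserved, m is a parameter here, and the class and method names are illustrative:

```python
class ChainedHashTable:
    """A hash table with m slots; each slot chains its keys in a list."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def insert(self, k):
        self.slots[(k + 2) % self.m].append(k)    # h(k) = (k + 2) mod m

    def minimum(self):
        # O(m + n): every slot is visited and every chained key is examined.
        keys = [k for slot in self.slots for k in slot]
        return min(keys) if keys else None
```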
4. Binary Search Trees.

(a) The binary search tree that we build is: [Figure omitted.]

(b) Suppose P = {k_1, k_2, ..., k_l} are the keys we examine when searching for k. (In these examples the search was successful, so k_l = k.) Now let X = {x_1, x_2, ..., x_r} be those keys of P (in order) that are at most k, and let Y = {y_1, y_2, ..., y_s} be those keys of P (in order) that are at least k. In a search on a BST, it must be the case that x_1 ≤ x_2 ≤ ... ≤ x_r. Suppose not, and that x_i > x_{i+1}. Since x_i ≤ k, after we examine x_i we then search its right subtree R_i. But, by definition, every key in R_i is larger than x_i. In particular, as x_{i+1} ∈ R_i, we must have x_i < x_{i+1}, a contradiction. A similar argument shows that y_1 ≥ y_2 ≥ ... ≥ y_s.

i. X = {2,,, 6, 1} and Y = {,,,, 0, 1}. This is not a valid BST search sequence, as Y is not nonincreasing.

ii. X = {1, 6, 6,, 1, 0,, 6,, 1} and Y = {6,, 1,, 6,, 1}. This is a valid BST search sequence.

iii. X = {1, 1, 0, 1, 0, 6, 1} and Y = {0,, 2, 1}. This is not a valid BST search sequence, as 0 < 1 in X, so X is not nondecreasing.
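The test applied in parts i–iii can be phrased directly in code. A sketch, assuming the examined keys are given as a Python list and checking exactly the two monotonicity conditions proved above:

```python
def is_valid_search_sequence(p, k):
    """Could the key sequence p be examined by a BST search for k?

    X (the keys of p that are at most k, in order) must be nondecreasing,
    and Y (the keys at least k, in order) must be nonincreasing.
    """
    x = [key for key in p if key <= k]
    y = [key for key in p if key >= k]
    return (all(a <= b for a, b in zip(x, x[1:])) and
            all(a >= b for a, b in zip(y, y[1:])))
```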
5. Binary Search Trees.

Let P be the path from v_0 up to the root r. Define X = {x_0, x_1, x_2, ..., x_t} where x_0 = v_0 and, for i ≥ 0, x_{i+1} is the closest ancestor of x_i on P whose right child is an ancestor of x_i (counting x_i as an ancestor of itself). (Possibly x_t is the root, but it need not be.) It takes O(d) time to find all the vertices in X, as the length of P is at most d = depth(T).

Let L_i be the left subtree at x_i, and let |L_i| = λ_i. Now every vertex that is not in X or one of its left subtrees has a key greater than v_0 = x_0. So to find {v_1, ..., v_l} it suffices to consider just the nodes in X ∪ ⋃_{i=0}^{t} L_i. Observe that the vertices in L_i are in the right subtree of x_{i+1}, thus they have larger keys than x_{i+1}. It follows that the keys increase according to the order:

L_t, x_t, L_{t-1}, x_{t-1}, ..., L_1, x_1, L_0, x_0.

So the nodes in L_0 are the immediate predecessors of x_0 = v_0; next comes x_1, then L_1, etc.

Assume that |{L_s, x_s, L_{s-1}, x_{s-1}, ..., L_1, x_1, L_0}| ≥ l and that |{L_{s-1}, x_{s-1}, ..., L_1, x_1, L_0}| < l. We saw in class that we can sort L_i (or any subtree of a BST) in linear time via an in-order DFS. Thus we can sort all of L_0, L_1, ..., L_{s-1} in time O(∑_{i=0}^{s-1} λ_i) = O(l). The largest l − |{x_s, L_{s-1}, x_{s-1}, ..., L_1, x_1, L_0}| keys in L_s are then required to complete the list {v_l, ..., v_2, v_1}. If λ_s = O(l) we can just sort L_s. Otherwise, we find the maximum key in L_s by following the rightmost path Q = {q_1, q_2, ..., q_d}; we then recursively order their left subtrees L̂_d, L̂_{d-1}, ... until we have found l − |{x_s, L_{s-1}, x_{s-1}, ..., L_1, x_1, L_0}| keys in L_s. The total run time is O(d + l).
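The decomposition above works with the sets X and L_i directly. An alternative way to reach the same O(d + l) bound, shown here only as a sketch, is to walk to the predecessor l times; over all l walks each relevant edge is traversed O(1) times, for O(d + l) total. This assumes each node stores key, left, right, and parent pointers, which the original solution does not require:

```python
def predecessor(v):
    # If v has a left subtree, the predecessor is that subtree's maximum.
    if v.left is not None:
        v = v.left
        while v.right is not None:
            v = v.right
        return v
    # Otherwise climb until we leave some node's right subtree.
    while v.parent is not None and v is v.parent.left:
        v = v.parent
    return v.parent                 # None if v held the minimum key

def l_predecessors(v0, l):
    """Return the keys v_1, ..., v_l just below v0's key, in decreasing order."""
    out, v = [], v0
    while len(out) < l:
        v = predecessor(v)
        if v is None:
            break                   # fewer than l smaller keys exist
        out.append(v.key)
    return out
```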
6. Data Structures for Disjoint Sets.

The depth of the resulting tree T is again O(log |T|). So for a tree with n vertices the find(v) operation still takes O(log n) time.

We prove by induction that a tree with depth d has at least 2^d vertices. For the base cases: if d = 0 then T has at least 1 = 2^0 vertices, as required; similarly, if d = 1 then T has at least 2 = 2^1 vertices, as required. Now consider a tree T with depth d + 1, and consider the time at which T first had depth d + 1. Assume this happened when we applied the operation merge(x, y) to merge two trees T_x and T_y. We may assume that x pointed to y because |T_x| = ω(x) ≤ ω(y) = |T_y|. Before the merge it must be the case that depth(T_x) was exactly d, since hanging T_x below the root of T_y increases the depth of every vertex of T_x by one. Thus by induction |T_x| ≥ 2^d. But by definition |T_x| ≤ |T_y|. So |T| ≥ 2|T_x| ≥ 2^{d+1}, as desired.
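A sketch of merge and find with the weight rule analyzed above (union by size, without path compression), using an array-based forest; the class layout is an assumption:

```python
class DisjointSets:
    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.size = [1] * n           # ω(x): number of vertices in x's tree

    def find(self, v):
        # Follow parent pointers to the root: O(depth) = O(log n) steps.
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def merge(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] > self.size[ry]:
            rx, ry = ry, rx           # make rx the root of the smaller tree
        self.parent[rx] = ry          # smaller root points to larger root
        self.size[ry] += self.size[rx]
```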