Data Structures and Algorithms " Search Trees!!
Outline" Binary Search Trees! AVL Trees! (2,4) Trees! 2
Binary Search Trees! "! < 6 2 > 1 4 = 8 9
Ordered Dictionaries" Keys are assumed to come from a total order.! New operations:! first(): first entry in the dictionary ordering! last(): last entry in the dictionary ordering! successors(k): iterator of entries with keys greater than or equal to k; increasing order! predecessors(k): iterator of entries with keys less than or equal to k; decreasing order! 4
Binary Search" Binary search can perform operation find(k) on a dictionary implemented by means of an array-based sequence, sorted by key! similar to the high-low game! at each step, the number of candidate items is halved! terminates after O(log n) steps! Example: find(7)! 0 l 0 l 0 0 1 3 4 5 7 8 9 11 14 16 18 19 m h 1 3 4 5 7 8 9 11 14 16 18 19 m h 1 3 4 5 7 8 9 11 14 16 18 19 l m h 1 3 4 5 7 8 9 11 14 16 18 19 l=m =h 5
Search Tables" A search table is a dictionary implemented by means of a sorted sequence! We store the items of the dictionary in an array-based sequence, sorted by key! We use an external comparator for the keys! Performance:! find takes O(log n) time, using binary search! insert takes O(n) time since in the worst case we have to shift n items to make room for the new item! remove take O(n) time since in the worst case we have to shift n-1 items to compact the items after the removal! The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed (e.g., credit card authorizations)! 6
Binary Search Trees" A binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property:! Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) key(v) key(w) External nodes do not store items! An inorder traversal of a binary search trees visits the keys in increasing order! 2 1 4 8 6 9 7
Search" To search for a key k, we trace a downward path starting at the root! The next node visited depends on the outcome of the comparison of k with the key of the current node! If we reach a leaf, the key is not found and we return null! Example: find(4):! Call TreeSearch(4,root)! Algorithm TreeSearch(k, v) if T.isExternal (v) return v // v has null value if k < key(v) return TreeSearch(k, T.left(v)) else if k = key(v) return v else { k > key(v) } return TreeSearch(k, T.right(v)) 2 < > 1 4 = 8 6 9 8
Insertion" To perform operation insert(k, o), we search for key k (using TreeSearch)! Assume k is not already in the tree, and let w be the leaf reached by the search! We insert k at node w and expand w into an internal node! Example: insert 5! < 6 2 > 1 4 > 8 w 6 2 1 4 8 w 5 9 9 9
Deletion" To perform operation remove(k), we search for key k Assume key k is in the tree, and let let v be the node storing k If node v has a leaf child w, we remove v and w from the tree with operation removeexternal(w), which removes w and its parent! Example: remove 4! 2 > 1 4 v 8 w 5 2 < 1 5 8 6 6 9 9 10
Deletion (cont.)" We consider the case where the key k to be removed is stored at a node v whose children are both internal! we find the internal node w that follows v in an inorder traversal! 1 2 z v 3 w 5 8 6 9 we copy key(w) into node v! we remove node w and its left child z (which must be a leaf) by means of operation removeexternal(z)! Example: remove 3! 1 2 5 v 8 6 9 11
Consider a dictionary with n items implemented by means of a binary search tree of height h! the space used is O(n)! methods find, insert and remove take O(h) time! The height h is O(n) in the worst case and O(log n) in the best case! Performance" 12
AVL Trees" v 6 3 8 4 z
AVL Tree Definition" AVL trees are balanced.! An AVL Tree is a binary search tree such that for every internal node v of T, the heights of the children of v can differ by at most 1.! 2 44 4 3 17 78 1 2 32 50 1 48 62 An example of an AVL tree where the heights are shown next to the nodes: 1 14 88 1
n(2) 3 Height of an AVL Tree" 4 n(1) Fact: The height of an AVL tree storing n keys is O(log n).! Proof: Let us bound n(h): the minimum number of internal nodes of an AVL tree of height h.! We easily see that n(1) = 1 and n(2) = 2! For n > 2, an AVL tree of height h contains the root node, one AVL subtree of height h-1 and another of height h-2.! That is, n(h) = 1 + n(h-1) + n(h-2)! Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2). So! n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(n-6), (by induction),! n(h) > 2 i n(h-2i)! Solving the base case we get: n(h) > 2 h/2-1! Taking logarithms: h < 2log n(h) +2! Thus the height of an AVL tree is O(log n)! 15
Insertion in an AVL Tree" Insertion is as in a binary search tree! Always done by expanding an external node.! Example:! 44 44 17 78 17 78 a=y c=z 32 50 88 32 50 88 48 62 48 62 b=x w 54 before insertion after insertion 16
Trinode Restructuring" let (a,b,c) be an inorder listing of x, y, z! perform the rotations needed to make b the topmost node of the three! a=z b=y (other two cases are symmetrical) a=z c=y case 2: double rotation (a right rotation about c, then a left rotation about a) T 0 c=x T 0 b=x T 1 b=y T 3 b=x T 2 T 3 a=z c=x T 1 T 2 a=z c=y case 1: single rotation (a left rotation about a) T 0 T 1 T 2 T 3 T 0 T 1 T 2 T 3 17
Insertion Example, continued" 5 44 z 2 64 17 78 2 1 3 y 32 1 50 4 2 1 x 48 3 62 1 unbalanced......balanced 54 5 T T 0 2 T 1 2 17 7 1 88 T 3 4 44 3 1 2 2 32 1 50 78 1 48 2 y 4 3 1 The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart 54 your computer, and then open the file again. If the red x still appears, you may have to delete the image and then insert it again. 62 x 5 T 2 z 6 88 7 1 T 0 T 1 18T 3
Restructuring " " " "(as Single Rotations)" Single Rotations:! a = z b = y c = x single rotation a = z b = y c = x T 0 T 1 T 2 T 3 T 0 T 1 T 2 T 3 a = x b = y c = z single rotation a = x b = y c = z T 3 T 3 T 0 T 1 T 2 T 2 T 1 T 0 19
Restructuring " " " " "(as Double Rotations)" double rotations:! a = z b = x c = y double rotation a = z b = x c = y T 0 T 2 T 1 T 3 T 0 T 1 T 2 T 3 a = y b = x c = z double rotation a = y b = x c = z T 0 T 3 T 2 T 3 T 1 T 0 T 1 T 2 20
Removal in an AVL Tree" Removal begins as in a binary search tree, which means the node removed will become an empty external node. Its parent, w, may cause an imbalance.! Example:! 44 44 17 62 17 62 32 50 78 50 78 48 54 88 48 54 88 before deletion of 32 after deletion 21
Rebalancing after a Removal" Let z be the first unbalanced node encountered while travelling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height.! We perform restructure(x) to restore balance at z.! As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached! a=z 44 62 w 17 62 b=y 44 78 50 78 c=x 17 50 88 48 54 88 48 54 22
Running Times for AVL Trees" a single restructure is O(1)! using a linked-structure binary tree! find is O(log n)! height of tree is O(log n), no restructures needed! insert is O(log n)! initial find is O(log n)! Restructuring up the tree, maintaining heights is O(log n)! remove is O(log n)! initial find is O(log n)! Restructuring up the tree, maintaining heights is O(log n)! 23
(2,4) Trees" 9 2 5 7 10 14
Multi-Way Search Tree" A multi-way search tree is an ordered tree such that! Each internal node has at least two children and stores d -1 key-element items (k i, o i ), where d is the number of children! For a node with children v 1 v 2 v d storing keys k 1 k 2 k d-1! keys in the subtree of v 1 are less than k 1 keys in the subtree of v i are between k i-1 and k i (i = 2,, d - 1) keys in the subtree of v d are greater than k d-1 The leaves store no items and serve as placeholders! 11 24 2 6 8 15 27 32 30 25
Multi-Way Inorder Traversal" We can extend the notion of inorder traversal from binary trees to multi-way search trees! Namely, we visit item (k i, o i ) of node v between the recursive traversals of the subtrees of v rooted at children v i and v i + 1 An inorder traversal of a multi-way search tree visits the keys in increasing order! 11 24 8 12 2 6 8 15 27 32 2 4 6 10 14 18 1 3 5 7 9 11 13 16 19 30 26 15 17
Multi-Way Searching" Similar to search in a binary search tree! A each internal node with children v 1 v 2 v d and keys k 1 k 2 k d-1! k = k i (i = 1,, d - 1): the search terminates successfully k < k 1 : we continue the search in child v 1 k i-1 < k < k i (i = 2,, d - 1): we continue the search in child v i k > k d-1 : we continue the search in child v d! Reaching an external node terminates the search unsuccessfully! Example: search for 30! 11 24 2 6 8 15 27 32 30 27
(2,4) Trees" A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search with the following properties! Node-Size Property: every internal node has at most four children! Depth Property: all the external nodes have the same depth! Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node! 10 15 24 2 8 12 18 27 32 28
Height of a (2,4) Tree" Theorem: A (2,4) tree storing n items has height O(log n)!proof:! Let h be the height of a (2,4) tree with n items! Since there are at least 2 i items at depth i = 0,, h - 1 and no items at depth h, we have!! n 1 + 2 + 4 + + 2 h-1 = 2 h - 1 Thus, h log (n + 1) Searching in a (2,4) tree with n items takes O(log n) time! depth 0 1 h-1 h items 1 2 2 h-1 0 29
Insertion" We insert a new item (k, o) at the parent v of the leaf reached by searching for k We preserve the depth property but! We may cause an overflow (i.e., node v may become a 5-node)! Example: inserting key 30 causes an overflow! 10 15 24 2 8 12 18 v 27 32 35 30 10 15 24 2 8 12 18 27 30 32 35 v
Overflow and Split" We handle an overflow at a 5-node v with a split operation:! let v 1 v 5 be the children of v and k 1 k 4 be the keys of v! node v is replaced nodes v' and v" v' is a 3-node with keys k 1 k 2 and children v 1 v 2 v 3 v" is a 2-node with key k 4 and children v 4 v 5 key k 3 is inserted into the parent u of v (a new root may be created)! The overflow may propagate to the parent node u! u 15 24 v 12 18 27 30 32 35 u 15 24 32 v' 12 18 27 30 v" 35 v 1 v 2 v 3 v 4 v 5 31 v 1 v 2 v 3 v 4 v 5
Analysis of Insertion" Algorithm insert(k, o) 1. We search for key k to locate the insertion node v 2. We add the new entry (k, o) at node v 3. while overflow(v) if isroot(v) create a new empty root above v v split(v) Let T be a (2,4) tree with n items! Tree T has O(log n) height! Step 1 takes O(log n) time because we visit O(log n) nodes! Step 2 takes O(1) time! Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits! Thus, an insertion in a (2,4) tree takes O(log n) time! 32
Deletion" We reduce deletion of an entry to the case where the item is at the node with leaf children! Otherwise, we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter entry! Example: to delete key 24, we replace it with 27 (inorder successor)! 10 15 24 2 8 12 18 27 32 35 10 15 27 2 8 12 18 32 35 33
Underflow and Fusion" Deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys! To handle an underflow at node v with parent u, we consider two cases! Case 1: the adjacent siblings of v are 2-nodes! Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'! After a fusion, the underflow may propagate to the parent u u 9 14 2 5 7 10 w v u 2 5 7 9 v' 10 14 34
Underflow and Transfer" Case 2: an adjacent sibling w of v is a 3-node or a 4-node! Transfer operation:!!!1. we move a child of w to v!!!2. we move an item from u to v 3. we move an item from w to u! After a transfer, no underflow occurs! 2 u 4 9 w 6 8 v u 4 8 w v 2 6 9 35
Analysis of Deletion" Let T be a (2,4) tree with n items! Tree T has O(log n) height! In a deletion operation! We visit O(log n) nodes to locate the node from which to delete the entry! We handle an underflow with a series of O(log n) fusions, followed by at most one transfer! Each fusion and transfer takes O(1) time! Thus, deleting an item from a (2,4) tree takes O(log n) time! 36
References" Chapter 10: Data structures and Algorithms by Goodrich and Tamassia.! 37