Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Similar documents
Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Data Structures and Algorithms " Search Trees!!

AVL Trees. Manolis Koubarakis. Data Structures and Programming Techniques

Dictionary: an abstract data type

Dictionary: an abstract data type

CS 240 Data Structures and Data Management. Module 4: Dictionaries

INF2220: algorithms and data structures Series 1

past balancing schemes require maintenance of balance info at all times, are aggresive use work of searches to pay for work of rebalancing

Splay Trees. CMSC 420: Lecture 8

Fundamental Algorithms

Priority queues implemented via heaps

Binary Search Trees. Dictionary Operations: Additional operations: find(key) insert(key, value) erase(key)

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn

CS Data Structures and Algorithm Analysis

8: Splay Trees. Move-to-Front Heuristic. Splaying. Splay(12) CSE326 Spring April 16, Search for Pebbles: Move found item to front of list

Lecture 2 14 February, 2007

Problem. Maintain the above set so as to support certain

Ordered Dictionary & Binary Search Tree

Binary Search Trees. Motivation

Graphs and Trees Binary Search Trees AVL-Trees (a,b)-trees Splay-Trees. Search Trees. Tobias Lieber. April 14, 2008

Lecture 5: Splay Trees

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

Nearest Neighbor Search with Keywords

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October

Binary Search Trees. Lecture 29 Section Robb T. Koether. Hampden-Sydney College. Fri, Apr 8, 2016

7 Dictionary. 7.1 Binary Search Trees. Binary Search Trees: Searching. 7.1 Binary Search Trees

Chapter 6. Self-Adjusting Data Structures

Data Structures 1 NTIN066

Data Structures 1 NTIN066

Notes on Logarithmic Lower Bounds in the Cell Probe Model

Outline for Today. Static Optimality. Splay Trees. Properties of Splay Trees. Dynamic Optimality (ITA) Balanced BSTs aren't necessarily optimal!

Programming Binary Search Tree. Virendra Singh Indian Institute of Science Bangalore Lecture 12

AVL trees. AVL trees

Outline for Today. Static Optimality. Splay Trees. Properties of Splay Trees. Dynamic Optimality (ITA) Balanced BSTs aren't necessarily optimal!

Inge Li Gørtz. Thank you to Kevin Wayne for inspiration to slides

8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm

Illinois Institute of Technology Department of Computer Science. Splay Trees. CS 535 Design and Analysis of Algorithms Fall Semester, 2018

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

AVL Trees. Properties Insertion. October 17, 2017 Cinda Heeren / Geoffrey Tien 1

Data Structures and and Algorithm Xiaoqing Zheng

Functional Data Structures

Assignment 5: Solutions

Outline. Computer Science 331. Cost of Binary Search Tree Operations. Bounds on Height: Worst- and Average-Case

Fundamental Algorithms

Amortized Complexity Main Idea

CS361 Homework #3 Solutions

Part III. Data Structures. Abstract Data Type. Dynamic Set Operations. Dynamic Set Operations

RANDOMIZED ALGORITHMS

Splay trees (Sleator, Tarjan 1983)

Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018

Weight-balanced Binary Search Trees

Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs

Weight-balanced Binary Search Trees

Data Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86

Skip Lists. What is a Skip List. Skip Lists 3/19/14

Skip List. CS 561, Lecture 11. Skip List. Skip List

The null-pointers in a binary search tree are replaced by pointers to special null-vertices, that do not carry any object-data

1. Introduction Bottom-Up-Heapsort is a variant of the classical Heapsort algorithm due to Williams ([Wi64]) and Floyd ([F64]) and was rst presented i

16. Binary Search Trees

Chapter 10 Search Structures

16. Binary Search Trees

Verified Analysis of Functional Data Structures

Alpha-Beta Pruning: Algorithm and Analysis

Fibonacci Heaps These lecture slides are adapted from CLRS, Chapter 20.

Sorting. Chapter 11. CSE 2011 Prof. J. Elder Last Updated: :11 AM

Alpha-Beta Pruning: Algorithm and Analysis

CS 161: Design and Analysis of Algorithms

Lecture VI AMORTIZATION

Premaster Course Algorithms 1 Chapter 3: Elementary Data Structures

Alpha-Beta Pruning: Algorithm and Analysis

CS 161 Summer 2009 Homework #2 Sample Solutions

7 Dictionary. EADS c Ernst Mayr, Harald Räcke 109

Discrete Mathematics

arxiv: v1 [math.co] 22 Jan 2013

Slides for CIS 675. Huffman Encoding, 1. Huffman Encoding, 2. Huffman Encoding, 3. Encoding 1. DPV Chapter 5, Part 2. Encoding 2

Biased Quantiles. Flip Korn Graham Cormode S. Muthukrishnan

Lecture 14: Nov. 11 & 13

Algorithms and Data Structures 2016 Week 5 solutions (Tues 9th - Fri 12th February)

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort

Data selection. Lower complexity bound for sorting

Lecture 1 : Data Compression and Entropy

Scout, NegaScout and Proof-Number Search

Data Structures and Algorithms

Skylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland

CS60007 Algorithm Design and Analysis 2018 Assignment 1

7 Dictionary. Harald Räcke 125

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

AVL-Trees for Localized Search

University of Toronto Department of Electrical and Computer Engineering. Final Examination. ECE 345 Algorithms and Data Structures Fall 2016

7.3 AVL-Trees. Definition 15. Lemma 16. AVL-trees are binary search trees that fulfill the following balance condition.

Each internal node v with d(v) children stores d 1 keys. k i 1 < key in i-th sub-tree k i, where we use k 0 = and k d =.

CS 151. Red Black Trees & Structural Induction. Thursday, November 1, 12

Review Of Topics. Review: Induction

data structures and algorithms lecture 2

Loop Invariants and Binary Search. Chapter 4.4, 5.1

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 )

Transcription:

Search Trees Chapter 1 < 6 2 > 1 4 = 8 9-1 -

Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 -

Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 3 -

Binary Search Trees Ø A binary search tree is a proper binary tree storing key-value entries at its internal nodes and satisfying the following property: q Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have key(u) key(v) key(w) Ø We will assume that eternal nodes are placeholders : they do not store entries (makes algorithms a little simpler) Ø An inorder traversal of a binary search trees visits the keys in increasing order Ø Binary search trees are ideal for maps or dictionaries with ordered keys. 6 2 9 1 4 8-4 -

Binary Search Tree All nodes in left subtree Any node All nodes in right subtree 38 25 51 17 31 42 63 4 21 28 35 4 49 55 71-5 -

Ø Maintain a sub-tree. Search: Loop Invariant Ø If the key is contained in the original tree, then the key is contained in the sub-tree. key 17 38 25 51 17 31 42 63 4 21 28 35 4 49 55 71-6 -

Ø Cut sub-tree in half. Search: Define Step Ø Determine which half the key would be in. Ø Keep that half. key 17 38 25 51 17 31 42 63 4 21 28 35 4 49 55 71 If key < root, then key is in left half. If key = root, then key is found If key > root, then key is in right half. - 7 -

Search: Algorithm Ø To search for a key k, we trace a downward path starting at the root Ø The net node visited depends on the outcome of the comparison of k with the key of the current node Ø If we reach a leaf, the key is not found and return of an eternal node signals this. Ø Eample: find(4): q Call TreeSearch(4,root) Algorithm TreeSearch(k, v) if T.isEternal (v) return v if k < key(v) return TreeSearch(k, T.left(v)) else if k = key(v) return v else { k > key(v) } return TreeSearch(k, T.right(v)) < 6 2 > 1 4 = 8 9-8 -

Insertion Ø To perform operation insert(k, o), we search for key k (using TreeSearch) Ø Suppose k is not already in the tree, and let w be the leaf reached by the search Ø We insert k at node w and epand w into an internal node Ø Eample: insert 5 w < 6 6 2 > 1 4 > 8 w 9 2 1 4 8 w 5 9-9 -

Insertion Ø Suppose k is already in the tree, at node v. Ø We continue the downward search through v, and let w be the leaf reached by the search Ø Note that it would be correct to go either left or right at v. We go left by convention. Ø We insert k at node w and epand w into an internal node Ø Eample: insert 6 2 < > 1 4 8 > w 6 9 2 w 1 4 8 w 6 6 9-1 -

Deletion Ø To perform operation remove(k), we search for key k Ø Suppose key k is in the tree, and let v be the node storing k Ø If node v has an eternal leaf child w, we remove v and w from the tree with operation removeeternal(w), which removes w and its parent Ø Eample: remove 4 < 6 6 2 > 1 4 v 8 w w 5 9 2 1 5 8 9-11 -

Deletion (cont.) Ø Now consider the case where the key k to be removed is stored at a node v whose children are both internal q we find the internal node w that follows v in an inorder traversal q we copy the entry stored at w into node v q we remove node w and its left child (which must be a leaf) by means of operation removeeternal() Ø Eample: remove 3 1 3 v 1 5 v 2 8 2 8 w 5 6 9 6 9-12 -

Performance Ø Consider a dictionary with n items implemented by means of a linked binary search tree of height h q the space used is O(n) q methods find, insert and remove take O(h) time Ø The height h is O(n) in the worst case and O(log n) in the best case Ø It is thus worthwhile to balance the tree (net topic)! - 13 -

Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 14 -

AVL Trees Ø The AVL tree is the first balanced binary search tree ever invented. Ø It is named after its two inventors, G.M. Adelson-Velskii and E.M. Landis, who published it in their 1962 paper "An algorithm for the organiation of information. - 15 -

AVL Trees Ø AVL trees are balanced. Ø An AVL Tree is a binary search tree in which the heights of siblings can differ by at most 1. 4 44 2 3 17 78 1 2 32 5 88 1 1 1 48 62 height - 16 -

Height of an AVL Tree Ø Claim: The height of an AVL tree storing n keys is O(log n). - 17 -

Height of an AVL Tree Ø Proof: We compute a lower bound n(h) on the number of internal nodes of an AVL tree of height h. Ø Observe that n(1) = 1 and n(2) = 2 Ø For h > 2, a minimal AVL tree contains the root node, one minimal AVL subtree of height h - 1 and another of height h - 2. Ø That is, n(h) = 1 + n(h - 1) + n(h - 2) Ø Knowing n(h - 1) > n(h - 2), we get n(h) > 2n(h - 2). So n(h) > 2n(h - 2), n(h) > 4n(h - 4), n(h) > 8n(n - 6), > 2 i n(h - 2i) Ø If h is even, we let i = h/2-1, so that n(h) > 2 h/2-1 n(2) = 2 h/2 Ø If h is odd, we let i = h/2-1/2, so that n(h) > 2 h/2-1/2 n(1) = 2 h/2-1/2 Ø In either case, n(h) > 2 h/2-1 Ø Taking logarithms: h < 2log(n(h)) +2 Ø Thus the height of an AVL tree is O(log n) n(2) 3 4 n(1) - 18 -

END OF LECTURE MAR 6, 214-19 -

Insertion height = 2 7 height = 3 7 1 4 Insert(2) 2 4 Problem! 1 2-2 -

Insertion Ø Imbalance may occur at any ancestor of the inserted node. height = 3 7 height = 4 7 2 4 1 8 Insert(2) 3 4 Problem! 1 8 1 3 1 5 2 3 1 5 1 2-21 -

Ø Step 1: Search Insertion: Rebalancing Strategy q Starting at the inserted node, traverse toward the root until an imbalance is discovered. height = 4 7 3 4 Problem! 1 8 2 3 1 5 1 2-22 -

Insertion: Rebalancing Strategy Ø Step 2: Repair q The repair strategy is called trinode restructuring. q 3 nodes, y and are distinguished: ² = the parent of the high sibling ² y = the high sibling ² = the high child of the high sibling q We can now think of the subtree 3 rooted at as consisting of these 3 nodes plus their 4 subtrees 1 2 2 3 height = 4 7 1 4 Problem! 8 1 5-23 -

Ø Step 2: Repair Insertion: Rebalancing Strategy q The idea is to rearrange these 3 nodes so that the middle value becomes the root and the other two becomes its children. q Thus the grandparent parent child structure becomes a triangular parent two children structure. q Note that must be either bigger than both and y or smaller than both and y. q Thus either or y is made the root of this subtree. height = h h-1 y T 2 q Then the subtrees T are attached at the appropriate places. T T 1 q Since the heights of subtrees T differ by at most 1, the resulting tree is balanced. - 24 - one is & one is h-4

Insertion: Trinode Restructuring Eample Note that y is the middle value. height = h h-1 y Restructure h-1 y T 2 T T 1 T 2 T T 1 one is & one is h-4 one is & one is h-4-25 -

Insertion: Trinode Restructuring - 4 Cases Ø There are 4 different possible relationships between the three nodes, y and before restructuring: y y y y height = h height = h height = h height = h y h-1 y h-1 T y h-1 y h-1 T T 2 T 1 T T T 1 T 2 T 1 T 2 T 1 T 2 one is & one is h-4 one is & one is h-4 one is & one is h-4 one is & one is h-4-26 -

Insertion: Trinode Restructuring - 4 Cases Ø This leads to 4 different solutions, all based on the same principle. y y y y height = h height = h height = h height = h y h-1 y h-1 T y h-1 y h-1 T T 2 T 1 T T T 1 T 2 T 1 T 2 T 1 T 2 one is & one is h-4 one is & one is h-4 one is & one is h-4 one is & one is h-4-27 -

Insertion: Trinode Restructuring - Case 1 Note that y is the middle value. height = h h-1 y Restructure h-1 y T 2 T T 1 T 2 T T 1 one is & one is h-4 one is & one is h-4-28 -

Insertion: Trinode Restructuring - Case 2 height = h h-1 y y h-1 T Restructure T 1 T T 1 T 2 T 2 one is & one is h-4 one is & one is h-4-29 -

Insertion: Trinode Restructuring - Case 3 height = h h-1 h-1 y Restructure y T T T 1 T 2 T 1 T 2 one is & one is h-4 one is & one is h-4-3 -

Insertion: Trinode Restructuring - Case 4 height = h h-1 y h-1 T Restructure y T T 1 T 2 T 1 T 2 one is & one is h-4 one is & one is h-4-31 -

Insertion: Trinode Restructuring - The Whole Tree Ø Do we have to repeat this process further up the tree? Ø No! q The tree was balanced before the insertion. q Insertion raised the height of the subtree by 1. q Rebalancing lowered the height of the subtree by 1. q Thus the whole tree is still balanced. height = h h-1 y Restructure h-1 y T 2 T T 1 T 2 T T 1 one is & one is h-4 one is & one is h-4-32 -

Removal Ø Imbalance may occur at an ancestor of the removed node. height = 3 7 height = 3 7 2 4 1 8 Remove(8) 2 4 Problem! 1 3 1 5 1 3 1 5-33 -

Removal: Rebalancing Strategy Ø Step 1: Search q Let w be the node actually removed (i.e., the node matching the key if it has a leaf child, otherwise the node following in an in-order traversal. q Starting at w, traverse toward the root until an imbalance is discovered. height = 3 7 2 4 Problem! 1 3 1 5-34 -

Removal: Rebalancing Strategy Ø Step 2: Repair q We again use trinode restructuring. q 3 nodes, y and are distinguished: height = 3 7 ² = the parent of the high sibling ² y = the high sibling ² = the high child of the high sibling (if children are equally high, keep chain linear) 1 3 2 4 1 Problem! 5-35 -

Removal: Rebalancing Strategy Ø Step 2: Repair q The idea is to rearrange these 3 nodes so that the middle value becomes the root and the other two becomes its children. q Thus the grandparent parent child structure becomes a triangular parent two children structure. q Note that must be either bigger than both and y or smaller than both and y. q Thus either or y is made the root of this subtree, and is lowered by 1. q Then the subtrees T are attached at the appropriate places. q Although the subtrees T can differ in height by up to 2, after restructuring, sibling subtrees will differ by at most 1. T T 1 height = h h-1 y or or & h-4 T 2-36 -

Removal: Trinode Restructuring - 4 Cases Ø There are 4 different possible relationships between the three nodes, y and before restructuring: y y y y height = h height = h height = h height = h y h-1 y h-1 T y h-1 y h-1 T or T 2 T 1 or T T T 1 T 2 T 1 T 2 T 1 T 2 or & h-4 or & h-4 or & h-4 or & h-4-37 -

Removal: Trinode Restructuring - Case 1 Note that y is the middle value. height = h Restructure h or h-1 y h-1 y h-1 or T T 1 or T 2 T T 1 T 2 or & h-4 or or & h-4-38 -

Removal: Trinode Restructuring - Case 2 height = h h or h-1 y y h-1 T Restructure h-1 or T 1 or T T 1 T 2 T 2 or or & h-4 or & h-4-39 -

Removal: Trinode Restructuring - Case 3 height = h h-1 h-1 y Restructure y T T T 1 T 2 T 1 T 2 or & h-4 or & h-4-4 -

Removal: Trinode Restructuring - Case 4 height = h h-1 y h-1 T Restructure y T T 1 T 2 T 1 T 2 or & h-4 or & h-4-41 -

Ø Step 2: Repair Removal: Rebalancing Strategy q Unfortunately, trinode restructuring may reduce the height of the subtree, causing another imbalance further up the tree. q Thus this search and repair process must in the worst case be repeated until we reach the root. - 42 -

Java Implementation of AVL Trees Ø Please see tet - 43 -

Running Times for AVL Trees Ø a single restructure is O(1) q using a linked-structure binary tree Ø find is O(log n) q height of tree is O(log n), no restructures needed Ø insert is O(log n) q initial find is O(log n) q Restructuring is O(1) Ø remove is O(log n) q initial find is O(log n) q Restructuring up the tree, maintaining heights is O(log n) - 44 -

AVLTree Eample - 45 -

Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 46 -

Splay Trees Ø Self-balancing BST Ø Invented by Daniel Sleator and Bob Tarjan Ø Allows quick access to recently accessed elements D. Sleator Ø Bad: worst-case O(n) Ø Good: average (amortied) case O(log n) Ø Often perform better than other BSTs in practice R. Tarjan - 47 -

Splaying Ø Splaying is an operation performed on a node that iteratively moves the node to the root of the tree. Ø In splay trees, each BST operation (find, insert, remove) is augmented with a splay operation. Ø In this way, recently searched and inserted elements are near the top of the tree, for quick access. - 48 -

3 Types of Splay Steps Ø Each splay operation on a node consists of a sequence of splay steps. Ø Each splay step moves the node up toward the root by 1 or 2 levels. Ø There are 2 types of step: q Zig-Zig q Zig-Zag q Zig Ø These steps are iterated until the node is moved to the root. - 49 -

Zig-Zig Ø Performed when the node forms a linear chain with its parent and grandparent. q i.e., right-right or left-left y y T 4 ig-ig T 1 T 2 T 1 T 2 T 4-5 -

Zig-Zag Ø Performed when the node forms a non-linear chain with its parent and grandparent q i.e., right-left or left-right y ig-ag y T 1 T 4 T 1 T 2 T 4 T 2-51 -

Zig Ø Performed when the node has no grandparent q i.e., its parent is the root y ig T 4 w y w T 1 T 2 T 4 T 1 T 2-52 -

Splay Trees & Ordered Dictionaries Ø which nodes are splayed after each operation? method find(k) insert(k,v) splay node if key found, use that node if key not found, use parent of eternal node where search terminated use the new node containing the entry inserted remove(k) use the parent of the internal node w that was actually removed from the tree (the parent of the node that the removed item was swapped with) - 53 -

Recall BST Deletion Ø Now consider the case where the key k to be removed is stored at a node v whose children are both internal q we find the internal node w that follows v in an inorder traversal q we copy key(w) into node v q we remove node w and its left child (which must be a leaf) by means of operation removeeternal() Ø Eample: remove 3 which node will be splayed? 1 3 v 1 5 v 2 8 2 8 w 5 6 9 6 9-54 -

Note on Deletion Ø The tet (Goodrich, p. 463) uses a different convention for BST deletion in their splaying eample q Instead of deleting the leftmost internal node of the right subtree, they delete the rightmost internal node of the left subtree. q We will stick with the convention of deleting the leftmost internal node of the right subtree (the node immediately following the element to be removed in an inorder traversal). - 55 -

Splay Tree Eample - 56 -

END OF LECTURE MAR 11, 214-57 -

Performance Ø Worst-case is O(n) q Eample: ² Find all elements in sorted order ² This will make the tree a left linear chain of height n, with the smallest element at the bottom ² Subsequent search for the smallest element will be O(n) - 58 -

Ø Average-case is O(log n) Performance q Proof uses amortied analysis q We will not cover this Ø Operations on more frequently-accessed entries are faster. q Given a sequence of m operations, the running time to access entry i is: ( ) ( ) O log m / f (i) where f(i) is the number of times entry i is accessed. - 59 -

Ø (2, 4) Trees Other Forms of Search Trees q These are multi-way search trees (not binary trees) in which internal nodes have between 2 and 4 children q Have the property that all eternal nodes have eactly the same depth. q Worst-case O(log n) operations q Somewhat complicated to implement Ø Red-Black Trees q Binary search trees q Worst-case O(log n) operations q Somewhat easier to implement q Requires only O(1) structural changes per update - 6 -

Summary Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 61 -