15-750: Graduate Algorithms                                February 3, 2016

Lecture 5: Splay Trees

Lecturer: Gary Miller                      Scribe: Jiayi Li, Tianyi Yang

1 Introduction

Recall from the previous lecture that balanced binary search trees (BSTs) are those whose maximum depth is O(log n). This property guarantees O(log n) worst-case performance per operation, but the worst case is not always what matters most. If the same query is searched many times, we would like the data structure to adjust itself to the observed empirical frequencies, so that repeating the query is faster. The splay tree is a variant of the BST that does exactly that. Every time Search(x) is called on a splay tree (assume for now that x is in the tree), in addition to performing the search, the tree rotates x to the root using a specialized operation Splay(x). Although splay trees are not balanced by the worst-case standard, they are self-balancing thanks to Splay(x), in the sense that no extra information is stored to maintain balance; in the long run, the amortized cost works out to O(log n) per operation.

2 Splay(x)

As mentioned above, the distinguishing feature of splay trees is the operation Splay(x), which rotates a node x to the root using a series of three rules, described below. We give two definitions of these rules. The first is an imperative description using rotations, while the second describes the new BST as a change to the insertion order (equivalently, priority order) that determines the original tree. Note that an element can be moved up in a BST by moving it earlier in the insertion order.

1. Zig

2. Zig-Zag

3. Zig-Zig (the interesting case)

Depending on the relative positions of x, parent(x), and grandparent(x), one of the above rules dictates how the rotation takes place. x keeps rotating up until it becomes the root.

In addition to a set of rotation rules, Splay(x) can also be described by rules that change priorities in a treap.

Priority version of Splay: given x, y, z such that parent(x) = y and parent(y) = z:
  - move y in front of z;
  - move x in front of y.

One might ask why it is not sufficient to simply rotate x with parent(x); in other words, why include the Zig-Zig rule, which involves grandparent(x)? This can be illustrated through an example.

Example 2.1. Let the keys be 0, 1, 2, 3, 4 and the priorities be (4, 3, 2, 1, 0). Suppose we do not include the Zig-Zig rule and simply rotate x with parent(x) each time. Consider the following sequence of operations.

  Move 0 to the front (MTF 0):  priorities (4, 3, 2, 1, 0) → (0, 4, 3, 2, 1),  4 rotations
  MTF 1:  priorities (0, 4, 3, 2, 1) → (1, 0, 4, 3, 2),  4 rotations
  MTF 2:  priorities (1, 0, 4, 3, 2) → (2, 1, 0, 4, 3),  3 rotations
  MTF 3:  priorities (2, 1, 0, 4, 3) → (3, 2, 1, 0, 4),  2 rotations
  MTF 4:  priorities (3, 2, 1, 0, 4) → (4, 3, 2, 1, 0),  1 rotation

The total number of rotations is 4 + 4 + 3 + 2 + 1 = 14. In general, given keys (n, n−1, ..., 0) with insertion order (n, n−1, ..., 0), if we leave out the Zig-Zig rule and splay 0, ..., n in this order, the number of rotations performed is:

  Operation   MTF(0)   MTF(1)   MTF(2)   ...   MTF(n)     (n + 1 operations)
  Rotations      n        n      n − 1   ...      1        (Ω(n²) in total)

Thus the average number of rotations is Θ(n). On the other hand, if we include the Zig-Zig rule, the same access sequence uses only 6 + 2 + 2 + 1 + 1 = 12 < 14 rotations.

3 Splay Dictionary Operations

We next show how one can use Splay to implement an efficient dictionary. The basic operations of splay trees all employ Splay(x, T).
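Before giving the dictionary operations, here is a minimal Python sketch of Splay(x) as described by the rotation rules of Section 2. It is not from the original notes: the Node class, its field names, and the helper rotate_up are assumptions made for illustration; the sketch only aims to make the Zig, Zig-Zag, and Zig-Zig cases concrete.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate x above its parent (a single left or right rotation)."""
    p, g = x.parent, x.parent.parent
    if x is p.left:                      # right rotation
        p.left, x.right = x.right, p
        if p.left:
            p.left.parent = p
    else:                                # left rotation
        p.right, x.left = x.left, p
        if p.right:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root using the Zig / Zig-Zag / Zig-Zig rules."""
    while x.parent:
        p, g = x.parent, x.parent.parent
        if g is None:                             # Zig: parent is the root
            rotate_up(x)
        elif (x is p.left) == (p is g.left):      # Zig-Zig: same side twice
            rotate_up(p)                          # rotate the parent edge first
            rotate_up(x)
        else:                                     # Zig-Zag: opposite sides
            rotate_up(x)
            rotate_up(x)
    return x

Note the design point of Example 2.1: in the Zig-Zig case the parent is rotated up before x. If one instead rotated x up twice, the procedure would degenerate into the naive move-to-root scheme that incurs Ω(n²) rotations on the sequence above.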

Algorithm 1 Splay Dictionary Operations

  function Insert(x, T)
      VanillaInsert(x, T)
      Splay(x, T)
  end function

  function Delete(x, T)
      Splay(x, T), giving a tree with root x, left subtree A, and right subtree B
      if B is empty then
          return A
      else
          Splay(largest(A), A), giving a tree rooted at largest(A) with no right child
          return that tree with B attached as the right subtree of its root
      end if
  end function

  function Lookup(x, T)
      Search(x, T)
      if x is found then
          return Splay(x, T)
      else
          return Splay(parent(x), T)      (parent(x): the last node reached by the search)
      end if
  end function

Note that Splay(x, T) is called even if Lookup(x) fails to find the query. This is because splay trees are designed to change all the time, precisely to address the problem discussed in the introduction.

4 Amortized Analysis of Splay

4.1 Preliminaries

We use the potential method to study the amortized cost of Splay. To do so, we first need to define a potential function Φ(T). Conceptually, Φ(T) should be large for unbalanced trees.

Definition 4.1. Let T be a binary search tree. S(x) is the number of nodes in the subtree rooted at x.

Definition 4.2. Let T be a binary search tree. The rank of a node x is r(x) = ⌊log₂(S(x))⌋.
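A small sketch (again not from the notes) of Definitions 4.1 and 4.2, reusing the hypothetical Node class from the Splay sketch above; bit_length is used only as an exact way of taking ⌊log₂⌋ of a positive integer.

def subtree_size(x):
    """S(x): the number of nodes in the subtree rooted at x (0 for an empty subtree)."""
    if x is None:
        return 0
    return 1 + subtree_size(x.left) + subtree_size(x.right)

def rank(x):
    """r(x) = floor(log2(S(x))); for a positive integer S this is S.bit_length() - 1."""
    return subtree_size(x).bit_length() - 1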

Definition 4.3. The potential of a binary search tree T is given by

  Φ(T) = Σ_{x ∈ T} r(x).

Example 4.4. If T is a line (a single path), then

  Φ(T) = Σ_{i=1}^{n} ⌊log₂ i⌋ = Θ(n log n).

To see the last equality, note that Σ_{i=1}^{n} log₂ i = log₂(n!) and that dropping the floors changes the sum by less than n, so it suffices to show log(n!) = Θ(n log n). First,

  log(n!) < log(nⁿ) = n log n,

so log(n!) = O(n log n). In addition,

  log(n!) = Σ_{i=1}^{n} log i > Σ_{i=n/2}^{n} log i > (n/2) log(n/2),

so log(n!) = Ω(n log n). Hence log(n!) = Θ(n log n).

Example 4.5. If T is balanced, then Φ(T) = Θ(n).

Claim 4.6. For all T, Φ(T) = Ω(n) and Φ(T) = O(n log n). That is, balanced trees and lines attain the lower and upper bounds on the potential, respectively.

Given the potential function, we are ready to define the amortized cost (AC):

  AC = Unit-Cost + ΔΦ,

where the unit cost is the number of rule applications, which is about depth(x)/2, i.e. about half the number of rotations. Summing over a sequence of operations, the total actual cost equals the total amortized cost minus the net increase in potential; since Φ is always nonnegative, the total actual cost is at most the total amortized cost plus the initial potential, so per-operation amortized bounds yield bounds for entire sequences.

4.2 Access Lemma

In order to study the amortized cost of searching in a splay tree, we first need to understand the running time of Splay. The Access Lemma is the main result in this regard. We first state it.

Lemma 4.7 (Access Lemma). AC(Splay(x)) ≤ 3(r(root) − r(x)) + 1.

Corollary 4.8. AC(Splay(x)) ≤ 3 log n + 1 (since r(root) ≤ log₂ n and r(x) ≥ 0).

The proof of the Access Lemma requires some preliminary results regarding ranks, which we show first.

Lemma 4.9 (Simple Fact 1). If a node x has two children a, b such that r(a) = r(b) = r, then r(x) > r.

Proof. S(x) = S(a) + S(b) + 1 > S(a) + S(b) ≥ 2^{r(a)} + 2^{r(b)} = 2^r + 2^r = 2^{r+1}, hence r(x) ≥ r + 1.
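As a brief aside before the remaining simple facts: the following sketch (not in the original notes; it reuses the hypothetical Node, subtree_size, and rank helpers above, and the tree-building helpers are invented for illustration) computes Φ(T) for a left path and for a balanced tree on the same keys, illustrating Examples 4.4 and 4.5 and Claim 4.6 numerically.

def potential(x):
    """Phi(T) = sum of r(y) over all nodes y in the tree rooted at x."""
    if x is None:
        return 0
    return rank(x) + potential(x.left) + potential(x.right)

def build_path(keys):
    """A maximally unbalanced BST: a left path with the largest key at the root."""
    root = None
    for k in sorted(keys):
        node = Node(k)
        node.left = root
        if root:
            root.parent = node
        root = node
    return root

def build_balanced(keys):
    """A perfectly balanced BST on the sorted keys."""
    keys = sorted(keys)
    if not keys:
        return None
    mid = len(keys) // 2
    root = Node(keys[mid])
    root.left = build_balanced(keys[:mid])
    root.right = build_balanced(keys[mid + 1:])
    for child in (root.left, root.right):
        if child:
            child.parent = root
    return root

keys = range(1, 513)
print(potential(build_path(keys)))       # grows like Theta(n log n)
print(potential(build_balanced(keys)))   # grows like Theta(n)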

Figure 1: Simple Fact 1. r′ > r

Figure 2: Simple Fact 2. r > r′

Corollary 4.10 (Simple Fact 2). If a node x has two children a, b such that r(x) = r(a), then r(x) > r(b).

Corollary 4.11 (Simple Fact 3). If a node x has two children a, b such that r(x) = r(a), then 2r(x) > r(a) + r(b).

Figure 3: Simple Fact 3. 2r > r + r′

Proof of the Access Lemma

Proof. Throughout the proof, we let C, P, GP denote the child (the node being splayed), its parent, and its grandparent, respectively. We claim that it is sufficient to show that

  AC(Zig) ≤ 3(r(root) − r(C)) + 1        (1)
  AC(Zig-Zag), AC(Zig-Zig) ≤ 3(r(GP) − r(C))        (2)

Why these suffice can be illustrated through an example.

In the figure above, Splay(x) takes four steps (a Zig-Zag, a Zig-Zig, a Zig-Zag, and a final Zig), and each rᵢ represents the rank of the corresponding node on the access path, with r₀ = r(x) and r₄ = r(root). If (1) and (2) are true, then

  AC(Splay(x)) = AC(Zig-Zag) + AC(Zig-Zig) + AC(Zig-Zag) + AC(Zig)
               ≤ 3(r₁ − r₀) + 3(r₂ − r₁) + 3(r₃ − r₂) + 3(r₄ − r₃) + 1
               = 3(r₄ − r₀) + 1,

which is the statement of the Access Lemma.

We first show (1). As labeled in the figures above, the rank r of the root stays constant throughout the rotation: before the Zig, the root is P with rank r and its child C has rank r′; after the Zig, C is the root of the same set of nodes and hence has rank r, while P becomes a child with some new rank r″ ≤ r. The subtrees α, β, γ are unchanged. Hence

  Φ(Before Zig) = r + r′ + Φ(α) + Φ(β) + Φ(γ)
  Φ(After Zig)  = r + r″ + Φ(α) + Φ(β) + Φ(γ)

and therefore

  AC(Zig) = 1 + ΔΦ = 1 + r″ − r′ ≤ 1 + r − r′ ≤ 3(r − r′) + 1,

since r = r(P) ≥ r(C) = r′.

We now show (2). In the remainder of the proof, we let r = r(GP), b = r(P), a = r(C) be the ranks before the rotations, so r ≥ b ≥ a.

Case 1: r > a. After the rotations, C heads the same set of nodes that GP headed before, so its new rank is r, and the new ranks of P and GP are at most r. Hence

  Φ(Before) = r + b + a + Φ(subtrees) ≥ r + 2a + Φ(subtrees)
  Φ(After) ≤ 3r + Φ(subtrees),

so ΔΦ ≤ 3r − (r + 2a) = 2(r − a), and

  AC = 1 + ΔΦ ≤ 1 + 2(r − a) ≤ 3(r − a),

where the last step uses r − a ≥ 1 (ranks are integers and r > a).

Case 2: r = a. Since r ≥ b ≥ a, r = a implies r = b = a. Hence

  Φ(Before) = r + b + a + Φ(subtrees) = 3r + Φ(subtrees).

We claim that Φ(After) ≤ 3r − 1 + Φ(subtrees).

In the case of a Zig-Zag, the tree after the rotation has C as the new root, with GP and P as its two children. By Corollary 4.11 we have r(GP) + r(P) < 2r(C); since ranks are integers, r(GP) + r(P) ≤ 2r(C) − 1. Therefore

  Φ(After) = r(C) + r(GP) + r(P) + Φ(subtrees) ≤ 3r(C) − 1 + Φ(subtrees) = 3r − 1 + Φ(subtrees).

In the case of a Zig-Zig, consider the three stages: before the first rotation (left), after rotating P above GP (middle), and after rotating C above P (right). Note that from the left stage to the middle stage the subtree rooted at C is unchanged, hence r(C) = r in both. In the middle stage, r(P) = r(root) = r, so r(GP) < r by Corollary 4.10. And since the subtree rooted at GP is the same in the middle and right stages, r(GP) < r, i.e. r(GP) ≤ r − 1, after the rotation as well. Therefore

  Φ(After) = r(C) + r(P) + r(GP) + Φ(subtrees) ≤ r + r + (r − 1) + Φ(subtrees) = 3r − 1 + Φ(subtrees).

Therefore, for both Zig-Zag and Zig-Zig,

  ΔΦ = Φ(After) − Φ(Before) ≤ −1

and

  AC = 1 + ΔΦ ≤ 1 + (−1) = 0 = 3(r − a),

which completes the proof of (2), and hence of the Access Lemma.
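To close, here is a small empirical sanity check of the Access Lemma (not part of the original notes). It reuses the hypothetical Node class and the splay, subtree_size, rank, and potential helpers sketched earlier, builds a BST on 200 random keys by vanilla insertion, and then verifies, over 500 random lookups, that the unit cost (one per rule application, i.e. roughly half the depth of the accessed node) plus the change in potential never exceeds 3(r(root) − r(x)) + 1. The key range, seed, and sizes are arbitrary choices for the sketch.

import random

def depth(x):
    """Number of edges from x up to the root."""
    d = 0
    while x.parent:
        d += 1
        x = x.parent
    return d

def vanilla_insert(root, key):
    """Plain BST insertion with no splaying; returns the (unchanged) root."""
    node = Node(key)
    if root is None:
        return node
    cur = root
    while True:
        side = 'left' if key < cur.key else 'right'
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, node)
            node.parent = cur
            return root
        cur = child

def find(root, key):
    cur = root
    while cur is not None and cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur

random.seed(0)
keys = random.sample(range(10_000), 200)
root = None
for k in keys:
    root = vanilla_insert(root, k)

for _ in range(500):
    x = find(root, random.choice(keys))
    unit_cost = (depth(x) + 1) // 2           # one unit per rule application
    bound = 3 * (rank(root) - rank(x)) + 1    # the Access Lemma bound
    phi_before = potential(root)
    root = splay(x)
    assert unit_cost + potential(root) - phi_before <= bound
print("Access Lemma bound held on all 500 splays")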