Illinois Institute of Technology Department of Computer Science. Splay Trees. CS 535 Design and Analysis of Algorithms Fall Semester, 2018


Splay trees are a powerful form of (lexicographic) balanced binary trees devised by Sleator and Tarjan [1]. Splay trees are self-adjusting, so that frequently accessed items drift toward the root. Every time we access the tree, we reorganize it through a sequence of rotations. This reorganization may be expensive on a one-time basis, but analysis shows that the basic operations require $O(\log n)$ amortized time: we show that $m$ consecutive operations on a splay tree with $n$ nodes require total time $O((m+n)\log n + m)$. Moreover, splay trees do not require any extra information in each node, as is needed for red-black trees.

1 Splaying

The main operation in splay trees is the splay step. Splaying on a node $x$ moves $x$ to the root of the tree by a sequence of rotations that move it up two levels at a time. If $x$ is an even number of levels from the root, then we use these two-level rotations to bring $x$ directly to the root. If $x$ is an odd number of levels from the root, then we use these rotations to bring it up to one level below the root, at which point we apply either a right rotation (called a ZIG) or a left rotation (called a ZAG) to bring $x$ up to the root:

[Figure: the ZIG and ZAG single rotations.]

Thus, if our node is a left child, we ZIG it to the root. Similarly, if it is a right child, we ZAG it to the root.

To move the node two levels up the tree, consider its position relative to its grandparent. We apply a ZIG-ZIG (that is, two ZIGs in a row, the first at the grandparent of $x$ and the second at its parent) to the appropriate subtree if $x$ is the leftmost grandchild, or a ZAG-ZAG if it is the rightmost grandchild.

[Figure: the ZIG-ZIG and ZAG-ZAG double rotations.]
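The single rotations and the same-side double rotations described above can be sketched in Python as follows. This is a minimal illustration rather than code from the notes: the `Node` class and the function names `zig`, `zag`, `zig_zig`, and `zag_zag` are hypothetical, parent pointers are omitted for brevity, and each function takes the root of the subtree being rotated and returns the new subtree root.

```python
class Node:
    """A binary search tree node; no balance information is stored."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def zig(y):
    """Right rotation at y: the left child x of y becomes the subtree root."""
    x = y.left
    y.left = x.right   # the middle subtree moves from x to y
    x.right = y
    return x


def zag(y):
    """Left rotation at y: the right child x of y becomes the subtree root."""
    x = y.right
    y.right = x.left   # the middle subtree moves from x to y
    x.left = y
    return x


def zig_zig(z):
    """x is the leftmost grandchild of z: rotate at z first, then at the old parent."""
    return zig(zig(z))


def zag_zag(z):
    """Mirror image of zig_zig for the rightmost grandchild."""
    return zag(zag(z))


def inorder(t):
    """Keys in sorted order; rotations must never change this list."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


# A left-leaning chain 3 -> 2 -> 1 becomes the right-leaning chain 1 -> 2 -> 3.
chain = zig_zig(Node(3, Node(2, Node(1))))
print(chain.key, inorder(chain))
```

Because every function returns the new subtree root, the caller is responsible for storing that root back into the parent's child pointer.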

Similarly, if $x$ is the right child of a left parent, we apply a ZIG-ZAG:

[Figure: the ZIG-ZAG double rotation.]

That is, we first rotate left at $y$, the parent of $x$ (the ZAG), and then rotate right at $z$, the grandparent of $x$ (the ZIG). If $x$ is the left child of a right parent, we apply a ZAG-ZIG operation:

[Figure: the ZAG-ZIG double rotation.]

That is, we first rotate right at $y$ (the ZIG) and then rotate left at $z$ (the ZAG).

Using these operations, splaying on a node of depth $d$ requires $d$ rotations. We make a rotation our basic unit of time, as each one can be implemented in constant time. Thus, we can say that we need time $d$ to splay on a node at depth $d$.

Consider the following example, where we splay on the node 5. Note that the subtree rooted at 5 is constantly moving up the tree.

[Figure: splaying on node 5 by a ZAG-ZAG step followed by a ZIG-ZAG step.]

We see that the tree goes from being a long, thin list to being shorter and bushier. This is the general behavior under splaying: paths in the tree tend to shorten considerably. Play around with the following URL to get an idea of the power of the splay operation to keep a tree balanced as it undergoes searches, insertions, and deletions:

http://www.ibr.cs.tu-bs.de/courses/ss98/audii/applets/st/splatree-eample.html

2 Amortized Cost of Splaying

Since splaying is the primary operation, we now analyze its amortized cost. To do so, we define a potential function. Suppose we have a splay tree $T$. Then,

1. Assign each node $x$ in $T$ a positive weight $w(x)$.
2. Let the size of node $x$, denoted $S(x)$, be the sum of the weights of all the nodes in the subtree rooted at $x$.
3. Define the rank $r(x) = \lg S(x)$.

Define the potential function of the tree $T$ to be

$$\Phi(T) = \sum_{x \in T} r(x).$$

For example, let the weight function be $w(x) = 1$ for each (internal) node $x$ (weights of leaves, which are really null pointers, are 0), and consider the tree

[Figure: a five-node tree with root 5, whose children are 2 and 1; node 4 is the child of 2, and node 3 is the child of 4.]

The size of node 4, for example, is $w(4) + w(3) = 1 + 1 = 2$; thus, $r(4) = \lg 2 = 1$. We can similarly compute the rank of each internal node to get the potential of the tree:

$$\Phi(T) = r(1) + r(2) + r(3) + r(4) + r(5) = \lg 1 + \lg 3 + \lg 1 + \lg 2 + \lg 5 = \lg 30 \approx 4.91.$$

We now use the tools developed during our study of amortized analysis. Recall that

AMORTIZED COST = ACTUAL COST + change in potential,

so that

$$\hat{c}_i = c_i + \Phi(T_i) - \Phi(T_{i-1})$$

and hence

$$\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(T_n) - \Phi(T_0).$$

Here $T_i$ is the tree after the $i$th operation, $\hat{c}_i$ is the amortized cost of the $i$th operation (that is, what you would charge a customer of your data-structure business), and $c_i$ is the actual cost of that operation (that is, what you have to pay the graduate student to do the work). In other words, over a sequence of operations,

AMORTIZED COST(sequence) = ACTUAL COST(sequence) $+\ \Phi_{\text{end}} - \Phi_{\text{start}}$,

or

ACTUAL COST(sequence) = AMORTIZED COST(sequence) $-\ \Phi_{\text{end}} + \Phi_{\text{start}}$.    (1)

Because rotations take $O(1)$ time, we define time to mean the number of rotations needed in an operation on a tree.

Access Lemma. The amortized time to splay at a node $x$ in a tree with root $t$ is at most

$$3[r(t) - r(x)] + 1 = O\left(1 + \log\frac{S(t)}{S(x)}\right).$$

The proof of this lemma is by analyzing the various possible steps involved in splaying. Let $r'(\cdot)$ be the rank of a node after a splay step, and $r(\cdot)$ be the rank of a node before that step. We show that the amortized cost of any step of the splay operation is at most $3[r'(x) - r(x)]$, with the exception of the one extra rotation needed at the root if $x$ is initially at odd depth. Thus, when we perform several steps of a splay operation together, the $3[r'(x) - r(x)]$ terms telescope, giving an amortized cost of

$$\sum \text{amortized costs} \le 1 + \sum 3[r'(x) - r(x)] = 1 + 3[r_{\text{final}}(x) - r_{\text{initial}}(x)] = 1 + 3[\lg S(\text{root}) - \lg S(x)] = O\left(1 + \log\frac{S(t)}{S(x)}\right),$$

as claimed in the lemma. We now consider the various cases.

Case 1: One rotation

This case occurs as the last step when the splayed node is at an odd depth from the root. The actual cost of one rotation is just one time unit (rotation). Thus, we are left to analyze the change in potential. For the ZIG rotation

[Figure: a ZIG at $y$, where $x$ is the left child of $y$; $A$ and $B$ are the subtrees of $x$, and $C$ is the right subtree of $y$.]

the potentials of subtrees $A$, $B$, and $C$ are unaffected by a ZIG because their internal node structure does not change. Thus we need be concerned only with the change in the ranks of $x$ and $y$. Denote by $r(x)$ and $r(y)$ the ranks of $x$ and $y$, respectively, before the ZIG; denote by $r'(x)$ and $r'(y)$ the ranks of $x$ and $y$ after the ZIG operation. The change in the potential function caused by a ZIG is

$$\Delta\Phi = r'(x) + r'(y) - r(x) - r(y).$$

Clearly, $r'(y) \le r(y)$, since $y$ starts the ZIG overlooking subtrees $A$, $B$, and $C$ and node $x$, but ends the ZIG overlooking only subtrees $B$ and $C$. Thus, we can bound the potential difference by

$$\Delta\Phi = r'(x) - r(x) + r'(y) - r(y) \le r'(x) - r(x).$$

Now we can bound the amortized cost, using the definition of amortized cost:

$$\text{AMORTIZED COST} = \text{ACTUAL COST} + \Delta\Phi = 1 + \Delta\Phi \le 1 + r'(x) - r(x) \le 1 + 3[r'(x) - r(x)].$$

The last inequality is clearly a weak statement, since the 3 is unnecessary; nevertheless, this is the inequality needed for the telescoping mentioned above. The amortized cost of a ZAG operation can be computed in the same way, with the appropriate relabelings.

Case 2: ZIG-ZIG

The actual cost of a ZIG-ZIG operation is two time units, the two rotations. We now need to compute the change in the potential function. Recall the definition of a ZIG-ZIG:

[Figure: a ZIG-ZIG, where $x$ is the left child of $y$ and $y$ is the left child of $z$; $A$ and $B$ are the subtrees of $x$, $C$ is the right subtree of $y$, and $D$ is the right subtree of $z$.]

As in the previous case, the potentials of the subtrees $A$, $B$, $C$, and $D$ are unaffected by the operation. Thus, we have

$$\Delta\Phi = r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z). \qquad (2)$$

Now, to bring this $\Delta\Phi$ to the desired form, we notice a few relationships. First, $r'(x) = r(z)$ because the rotated $x$ is in precisely the same position as the old $z$. Moreover, $r(y) \ge r(x)$ because $y$ overlooks $x$ in the original tree. Similarly, $r'(y) \le r'(x)$. Thus, (2) becomes

$$\Delta\Phi \le r'(x) + r'(z) - 2r(x),$$

giving an amortized cost of

$$\text{AMORTIZED COST} \le 2 + r'(x) + r'(z) - 2r(x).$$

We want to show that

$$\text{AMORTIZED COST} \le 3[r'(x) - r(x)],$$

which would follow from

$$2 + r'(x) + r'(z) - 2r(x) \le 3[r'(x) - r(x)],$$

or

$$-2r'(x) + r(x) + r'(z) + 2 \le 0;$$

that is, proving that

$$-2r'(x) + r(x) + r'(z) \le -2$$

will give the claimed bound. Let us analyze the left-hand side of this last inequality:

$$-2r'(x) + r(x) + r'(z) = -r'(x) + r(x) - r'(x) + r'(z) = -\log\frac{S'(x)}{S(x)} - \log\frac{S'(x)}{S'(z)} = \log\frac{S(x)}{S'(x)} + \log\frac{S'(z)}{S'(x)}, \qquad (3)$$

where $S'(\cdot)$ is the size of a node after the ZIG-ZIG operation, and $S(\cdot)$ is the size of a node before the ZIG-ZIG operation. Define

$$a = \frac{S(x)}{S'(x)} \quad\text{and}\quad b = \frac{S'(z)}{S'(x)},$$

so that (3) becomes $\lg a + \lg b$. Clearly $a > 0$ and $b > 0$. Moreover,

$$S(x) + S'(z) \le S'(x),$$

because before the ZIG-ZIG $x$ overlooks subtrees $A$ and $B$, and after the ZIG-ZIG $z$ overlooks subtrees $C$ and $D$, whereas after the ZIG-ZIG $x$ overlooks all of the subtrees $A$, $B$, $C$, and $D$, in addition to $y$ and $z$. Thus,

$$\frac{S(x)}{S'(x)} + \frac{S'(z)}{S'(x)} \le 1$$

("less than" is possible because the weight of $y$ is not included), and hence $a + b \le 1$.

Thus, we have $a > 0$, $b > 0$, and $a + b \le 1$. Using the concavity of the logarithm, elementary calculus tells us that in the region of interest $\lg a + \lg b$ reaches a maximal value of $-2$ for $a = b = 1/2$. Thus, $\lg a + \lg b \le -2$. Substituting back we get

$$-2r'(x) + r(x) + r'(z) \le -2,$$

which is what we needed to show, so

$$\text{AMORTIZED COST} \le 3[r'(x) - r(x)]$$

for the ZIG-ZIG operation, as we claimed. By appropriate relabeling, we see that the ZAG-ZAG operation has the same amortized time.
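The maximization step used in this case can also be checked without calculus. The short derivation below is an alternative route, not the argument the notes use; it assumes only $a > 0$, $b > 0$, $a + b \le 1$, and the arithmetic-geometric mean inequality.

```latex
\begin{align*}
\lg a + \lg b
  &= \lg(ab) \\
  &\le \lg\!\left[\left(\frac{a+b}{2}\right)^{2}\right]
     && \text{(AM-GM: } ab \le \left(\tfrac{a+b}{2}\right)^{2}\text{)} \\
  &\le \lg\frac{1}{4}
     && \text{(since } a + b \le 1\text{)} \\
  &= -2 .
\end{align*}
```

Both inequalities are tight exactly when $a = b = 1/2$, which matches the maximizer quoted in the text.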

Case 3: ZIG-ZAG

This case is similar in fashion to the case of ZIG-ZIG that we just analyzed. First, recall the definition of a ZIG-ZAG:

[Figure: the ZIG-ZAG double rotation.]

As in the previous case, the actual cost of a ZIG-ZAG is 2 rotations, and we have to compute the change in the potential function,

$$\Delta\Phi = r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z).$$

Again we note that $r'(x) = r(z)$. Furthermore, $r(y) \ge r(x)$ because $y$ is above $x$ in the original tree. Thus we have

$$\text{AMORTIZED COST} = 2 + r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z) \le 2 + r'(y) + r'(z) - 2r(x).$$

As in the previous case, we wish to show that

$$\text{AMORTIZED COST} \le 3[r'(x) - r(x)];$$

so we will show that

$$2 + r'(y) + r'(z) - 2r(x) \le 2[r'(x) - r(x)],$$

which is less than or equal to our desired bound of $3[r'(x) - r(x)]$. The last inequality can be rearranged to

$$2r'(x) - r'(y) - r'(z) \ge 2.$$

As in the previous case, we know that $S'(y) + S'(z) \le S'(x)$, so that

$$\frac{S'(y)}{S'(x)} + \frac{S'(z)}{S'(x)} \le 1.$$

Define

$$a = \frac{S'(y)}{S'(x)} \quad\text{and}\quad b = \frac{S'(z)}{S'(x)},$$

and again we find $\lg a + \lg b \le -2$, so that

$$2r'(x) - r'(y) - r'(z) \ge 2,$$

and the stated bound on the amortized cost of the ZIG-ZAG step follows. The ZAG-ZIG case follows in exactly the same fashion, proving the Access Lemma.

Note that the double-rotation steps are necessary in this calculation, because they do not carry the $+1$ term that the single-rotation amortized costs carry. This $+1$ term would destroy the telescoping.
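The Access Lemma can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the notes: it uses unit weights $w(x) = 1$, a hypothetical `Node` class with parent pointers, and hypothetical helper names (`rotate_up`, `splay`, `size`, `potential`, `min_slack`). It splays random nodes of a tree that starts as a path and records the smallest gap between the bound $3[r(t) - r(x)] + 1$ and the measured amortized cost (rotations plus change in potential).

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Rotate x above its parent: a ZIG if x is a left child, else a ZAG."""
    y, g = x.parent, x.parent.parent
    if y.left is x:
        y.left, x.right = x.right, y
        if y.left is not None:
            y.left.parent = y
    else:
        y.right, x.left = x.left, y
        if y.right is not None:
            y.right.parent = y
    x.parent, y.parent = g, x
    if g is not None:
        if g.left is y:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Splay x to the root and return the number of rotations performed."""
    rotations = 0
    while x.parent is not None:
        y, z = x.parent, x.parent.parent
        if z is None:                          # single ZIG or ZAG at the root
            rotate_up(x)
            rotations += 1
        elif (z.left is y) == (y.left is x):   # ZIG-ZIG or ZAG-ZAG
            rotate_up(y)
            rotate_up(x)
            rotations += 2
        else:                                  # ZIG-ZAG or ZAG-ZIG
            rotate_up(x)
            rotate_up(x)
            rotations += 2
    return rotations


def size(t):
    """S(t) with unit weights: the number of nodes in the subtree."""
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def potential(t):
    """Phi = sum of lg S(x) over all nodes x of the subtree rooted at t."""
    if t is None:
        return 0.0
    return math.log2(size(t)) + potential(t.left) + potential(t.right)


def min_slack(n=64, accesses=200, seed=0):
    """Smallest value of (lemma bound - measured amortized cost) seen."""
    rng = random.Random(seed)
    nodes = [Node(k) for k in range(n)]
    for parent, child in zip(nodes, nodes[1:]):   # start from a right path
        parent.right, child.parent = child, parent
    root, slack = nodes[0], math.inf
    for _ in range(accesses):
        x = nodes[rng.randrange(n)]
        bound = 3 * (math.log2(n) - math.log2(size(x))) + 1
        before = potential(root)
        rotations = splay(x)
        root = x
        slack = min(slack, bound - (rotations + potential(root) - before))
    return slack


print(min_slack() >= 0)
```

Recomputing sizes and the potential from scratch on every access is quadratic and only suitable for a small check like this one; a real implementation would never materialize the potential at all, since it exists only in the analysis.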

3 Balance Theorem

We can now determine the actual time needed for multiple accesses to the tree.

Balance Theorem. The total access time for $m$ accesses of a tree with $n$ items is $O((m+n)\log n + m)$.

The proof of this theorem follows from our potential function and the Access Lemma, using (1). Define the weight of a node $x$ to be $w(x) = 1/n$. From the Access Lemma, we know that the amortized cost for the $m$ accesses is

$$m \cdot O\left(1 + \log\frac{S(t)}{S(x)}\right) \le m \cdot O\left(1 + \log\frac{1}{1/n}\right) = m \cdot O(1 + \log n).$$

The first step follows from the fact that $t$ overlooks all the other nodes, so its size is $n \cdot 1/n = 1$, while every node has size at least $1/n$. The greatest possible starting potential is

$$\Phi_{\text{start}} \le \sum_{i=1}^{n} \lg 1 = 0,$$

because no vertex can have size greater than 1 (the total size of the whole tree is 1). On the other hand, the smallest ending potential is

$$\Phi_{\text{end}} \ge \sum_{i=1}^{n} \lg\frac{1}{n} = -n\lg n,$$

because, at worst, every vertex has only its own weight of $1/n$ as its size. Putting all of the pieces together,

$$\text{ACTUAL COST} \le m \cdot O(1 + \log n) + 0 - (-n\log n) = O((m+n)\log n + m),$$

as the theorem states.

4 Static Optimality Theorem

We can refine the Balance Theorem if we know something about the access frequencies of the tree nodes.

Static Optimality Theorem. If each item is accessed at least once, the total time for $m$ accesses in a tree with $n$ nodes is

$$O\left(m + \sum_{i=1}^{n} q_i \log\frac{m}{q_i}\right),$$

where $q_i$ is the number of times item $i$ is accessed, so that $\sum_{i=1}^{n} q_i = m$.
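To get a feel for the expression in the theorem, the following sketch evaluates the quantity inside the $O(\cdot)$ for a uniform and a skewed access pattern and compares it with the quantity inside the Balance Theorem bound. The function names and the sample frequency vectors are made up for illustration, and the hidden constants of the $O$ notation are ignored, so only the relative sizes are meaningful.

```python
import math


def static_optimality_expr(q):
    """m + sum of q_i * lg(m / q_i), for access counts q_i >= 1."""
    m = sum(q)
    return m + sum(qi * math.log2(m / qi) for qi in q)


def balance_expr(q):
    """(m + n) * lg(n) + m, which depends only on m and n."""
    m, n = sum(q), len(q)
    return (m + n) * math.log2(n) + m


uniform = [4] * 8              # 32 accesses spread evenly over 8 items
skewed = [25] + [1] * 7        # 32 accesses, 25 of them to a single item

for name, q in (("uniform", uniform), ("skewed", skewed)):
    print(name, round(static_optimality_expr(q), 2), round(balance_expr(q), 2))
```

For the uniform pattern the sum equals $m \lg n$, so the two expressions are of the same order; for the skewed pattern the static optimality expression is noticeably smaller, which is the sense in which the theorem refines the Balance Theorem.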

The proof follows as in the Balance Theorem, but using the weight function $w(i) = q_i/m$ (so the sum of all weights is 1) and observing that the amortized cost of an access of item $i$ is

$$O\left(1 + \log\frac{S(t)}{S(i)}\right) = O\left(1 + \log\frac{m}{q_i}\right).$$

The Static Optimality Theorem is amazing because it is within a constant multiple of the entropy, the information-theoretic lower bound on access time for a binary tree! (This is also true for the Balance Theorem.) Thus splay trees are within a constant multiple of the lower bound on the problem. Moreover, they are as good as finger trees (trees that keep fingers pointing to the most frequently occurring items).

5 Operations on Splay Trees

With the splay operation and its analysis, it is not too difficult to implement and analyze operations such as search, insert, and delete on splay trees. First, however, we introduce three new tree operations.

Access. Access takes an item $i$ as input, runs a search on the tree to find the node containing $i$, then splays on that node, moving it to the top of the tree. If no such item is found, we splay on the last non-null node that we examined in the binary search; that is, we splay at either $i^-$ or $i^+$, the predecessor or successor of $i$, respectively.

Join. The join operation takes two trees, $T_1$ and $T_2$, for which every item in $T_2$ is greater than every item in $T_1$, and returns a single tree $T$ containing the items of both trees. Implementing join on splay trees requires accessing the largest item in $T_1$, which we denote by $i$, by following non-null right pointers from the root, followed by a splay at the last node found, which is $i$. The splay puts $i$ at the root of $T_1$; as the largest element, $i$ must have a null right subtree, which we replace with $T_2$:

[Figure: join of $T_1$ and $T_2$; after the splay, $i$ is the root with left subtree $T_1$ and right subtree $T_2$.]

Split. This is the reverse of join. It takes a tree $T$ and a node $i$ as input, and creates two trees $T_1$ and $T_2$ such that all items in $T_1$ are smaller than (or equal to) $i$ and all items in $T_2$ are greater than (or equal to) $i$. Implementing split on splay trees involves accessing $i$, and then breaking one of the root's branches, depending on whether the root is greater or less than $i$ (arbitrarily selecting one if $i$ is the root).
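The three operations just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the notes' own code: the `Node` class, the parent pointers, and all function names are hypothetical, and `split` resolves the "arbitrary" choice by putting an item equal to the key into the first tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Rotate x above its parent: a ZIG if x is a left child, else a ZAG."""
    y, g = x.parent, x.parent.parent
    if y.left is x:
        y.left, x.right = x.right, y
        if y.left is not None:
            y.left.parent = y
    else:
        y.right, x.left = x.left, y
        if y.right is not None:
            y.right.parent = y
    x.parent, y.parent = g, x
    if g is not None:
        if g.left is y:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Bring x to the root with double steps, plus one single step if needed."""
    while x.parent is not None:
        y, z = x.parent, x.parent.parent
        if z is not None:
            # Same-side grandchild: rotate at the grandparent first.
            rotate_up(y if (z.left is y) == (y.left is x) else x)
        rotate_up(x)
    return x


def access(root, key):
    """Search for key and splay the last non-null node visited."""
    if root is None:
        return None
    cur = root
    while True:
        nxt = cur.left if key < cur.key else cur.right if key > cur.key else None
        if nxt is None:
            return splay(cur)
        cur = nxt


def join(t1, t2):
    """Assumes every key in t1 is smaller than every key in t2."""
    if t1 is None:
        return t2
    cur = t1
    while cur.right is not None:
        cur = cur.right
    root = splay(cur)          # largest item of t1, so its right subtree is empty
    root.right = t2
    if t2 is not None:
        t2.parent = root
    return root


def split(root, key):
    """Return (t1, t2) with keys <= key in t1 and keys > key in t2."""
    if root is None:
        return None, None
    root = access(root, key)
    if root.key <= key:        # root is key or its predecessor: cut the right branch
        t2, root.right = root.right, None
        if t2 is not None:
            t2.parent = None
        return root, t2
    t1, root.left = root.left, None   # root is the successor: cut the left branch
    if t1 is not None:
        t1.parent = None
    return t1, root


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


tree = None
for k in range(10):            # keys arrive in increasing order, so join applies
    tree = join(tree, Node(k))
low, high = split(tree, 4)
print(inorder(low), inorder(high))
```

Joining the two halves again with `join(low, high)` recovers a single tree on the same keys, which is the sense in which split is the reverse of join.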

[Figure: split of $T$; after the splay at $i$, one branch of the root is broken to give $T_1$ and $T_2$.]

If $i$ is not in the tree, the root after the splay at $i$ is either $i^-$ or $i^+$, the lexicographic predecessor or successor of $i$, respectively.

Now it is easy to implement the familiar operations of insert and delete:

Insert. Takes a tree $T$ and an item $i$ (presumed not in the tree) as input and inserts a node containing $i$ into $T$. To perform an insert, simply split($T$, $i$) and then make a new tree whose left and right branches are the trees $T_1$ and $T_2$ returned from split and whose root contains the item $i$.

[Figure: insert; split $T$ at $i$ into $T_1$ and $T_2$, then connect them using a new root $i$.]

Note that insert uses the fact that the split operation works if the node $i$ is not present in the tree.

Delete. Takes a tree $T$ and an item $i$ in the tree and deletes $i$ from $T$. To perform delete, we again do a split($T$, $i$). Then we remove $i$ and join the resulting subtrees.

[Figure: delete; splay $T$ at $i$, remove $i$, then join $T_1$ and $T_2$.]

Delete could also be done by searching for the node containing $i$; suppose this node is $x$ and has a parent $y$. We then replace $x$ as a child of $y$ by the join of the left and right subtrees of $x$, and then splay on $y$.

All these operations have a logarithmic amortized time bound. Specifically, the following table gives the amortized times for the splay tree operations as a function of $W$, the total weight of the items in the tree(s). The variables $i^+$ and $i^-$ denote, respectively, the successor and predecessor of $i$ in the tree. If $i^-$ or $i^+$ is undefined, then $w(i^-) = \infty$ and $w(i^+) = \infty$, respectively.

Operation            Amortized cost
access($i$, $T$)     $3\lg\frac{W}{w(i)} + 1$, if $i$ is in $T$
access($i$, $T$)     $3\lg\frac{W}{\min\{w(i^-),\, w(i^+)\}} + 1$, if $i$ is not in $T$
join($T_1$, $T_2$)   $3\lg\frac{W}{w(i)} + O(1)$, where $i$ is the last item in $T_1$
split($i$, $T$)      $3\lg\frac{W}{w(i)} + O(1)$, if $i$ is in $T$
split($i$, $T$)      $3\lg\frac{W}{\min\{w(i^-),\, w(i^+)\}} + O(1)$, if $i$ is not in $T$
insert($i$, $T$)     $3\lg\frac{W - w(i)}{\min\{w(i^-),\, w(i^+)\}} + \lg\frac{W}{w(i)} + O(1)$
delete($i$, $T$)     $3\lg\frac{W}{w(i)} + 3\lg\frac{W - w(i)}{w(i^-)} + O(1)$

The bounds for access and split follow directly from the Access Lemma. The other bounds follow by analyzing the change in potential. For example, the bound on join is found as follows: first we do an access on the largest value $i$ in $T_1$; this costs at most $3\lg(S(T_1)/w(i)) + 1$ amortized time. The link requires only $O(1)$ additional work, but linking trees $T_1$ and $T_2$ increases the potential, so we also have to examine that. The only node whose rank changes is the root of $T_1$, which becomes the root of the entire tree. Thus, because the total weight is $W = S(T_1) + S(T_2)$, the change in potential of the new root $i$ is at most

$$\lg W - \lg S(T_1) = \lg\frac{W}{S(T_1)}.$$

Combining these terms gives the desired bound:

$$3\lg\frac{S(T_1)}{w(i)} + O(1) + \lg\frac{W}{S(T_1)} = 2\lg\frac{S(T_1)}{w(i)} + \lg\frac{S(T_1)}{w(i)} + \lg\frac{W}{S(T_1)} + O(1) = 2\lg\frac{S(T_1)}{w(i)} + \lg\frac{W}{w(i)} + O(1) \le 3\lg\frac{W}{w(i)} + O(1).$$

The bounds for insert and delete are proven in a similar manner.

Be warned: The constants hidden in the $O$ notation are large for this data structure, so it may not be practical in real life.

Reference

[1] Daniel D. Sleator and Robert E. Tarjan, Self-Adjusting Binary Search Trees, JACM, vol. 32 (1985), pp. 652-686.