Splay trees (Sleator, Tarjan 1983)



Main idea: Try to arrange the tree so that frequently used items are near the root. We shall assume that there is an item in every node, including internal nodes. The scheme can be adapted so that items are stored only at the leaves.

First attempt: Move the accessed item to the root by doing single rotations along the access path. (Diagram: rotating the edge between x and its parent y exchanges their positions while preserving symmetric order.)
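A minimal sketch of this first attempt, in Python (the names Node, rotate_up, and move_to_root are illustrative, not from the slides):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate the edge between x and its parent y, so x moves up one level."""
    y, p = x.parent, x.parent.parent
    if y.left is x:                 # x is a left child: right rotation at y
        y.left = x.right
        if y.left: y.left.parent = y
        x.right = y
    else:                           # x is a right child: left rotation at y
        y.right = x.left
        if y.right: y.right.parent = y
        x.left = y
    y.parent, x.parent = x, p
    if p is not None:               # reattach x where y used to hang
        if p.left is y: p.left = x
        else: p.right = x

def move_to_root(x):
    """The naive heuristic: rotate the accessed item all the way to the root."""
    while x.parent is not None:
        rotate_up(x)
    return x
```

Every rotation preserves symmetric order, but as the next slides note, this heuristic alone does not give good amortized bounds.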

Move to root (example): (diagram omitted: accessing item a in a tree on items a-e with subtrees A-F; a is rotated step by step up to the root)

Move to root (analysis): There are arbitrarily long access sequences for which the time per access is Ω(n)!


Splaying: Do rotations bottom-up along the access path, but in pairs, in a way that depends on the structure of the path. Splay step: (1) zig-zig: x and its parent y are both left children (or both right children); rotate the edge (y, z) first, then the edge (x, y).

Splaying (cont): (2) zig-zag: x is a left child and y is a right child (or vice versa); rotate the edge (x, y), then rotate x with its new parent z. (3) zig: y is the root; rotate the edge (x, y).
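The three cases can be coded directly on top of a single-rotation primitive. A hedged sketch (helper names are mine; the rotation helper is repeated so the snippet is self-contained):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate the edge between x and its parent."""
    y, p = x.parent, x.parent.parent
    if y.left is x:
        y.left = x.right
        if y.left: y.left.parent = y
        x.right = y
    else:
        y.right = x.left
        if y.right: y.right.parent = y
        x.left = y
    y.parent, x.parent = x, p
    if p is not None:
        if p.left is y: p.left = x
        else: p.right = x

def splay(x):
    """Move x to the root by zig, zig-zig, and zig-zag steps."""
    while x.parent is not None:
        y = x.parent
        z = y.parent
        if z is None:
            rotate_up(x)                 # zig: y is the root
        elif (z.left is y) == (y.left is x):
            rotate_up(y)                 # zig-zig: rotate the top edge first,
            rotate_up(x)                 # then the bottom edge
        else:
            rotate_up(x)                 # zig-zag: rotate x up twice
            rotate_up(x)
    return x
```

Splaying the deepest node of a 9-node left path, as in the example on the next slides, leaves that node at the root and roughly halves the depth of the original access path.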

Splaying (example): (diagrams omitted: splaying at node a, which lies at the bottom of a long access path through b, c, ..., i with subtrees A-J; each pair of rotations moves a two levels up, and after the splay a is the root and the depth of the original path has roughly halved)



Splaying (analysis): Assume each item i has a positive weight w(i), which is arbitrary but fixed. Define the size s(x) of a node x in the tree as the sum of the weights of the items in its subtree, and the rank of x as r(x) = log2(s(x)). Let W denote the total weight of all items. Measure the splay time by the number of rotations.

Access lemma: The amortized time to splay a node x in a tree with root t is at most 3(r(t) - r(x)) + 1 = 3 log(s(t)/s(x)) + 1. The potential is the sum of the ranks: Φ = Σ_x r(x), where r(x) = log2(s(x)). This has many consequences:

Balance theorem: Accessing m items in an n-node splay tree takes O((m+n) log n) time. Proof: Assign each item weight 1. Then s(t) = n and s(x) ≥ 1, so by the access lemma each access takes at most 3 log n + 1 amortized time. Since every rank satisfies 0 ≤ r(x) ≤ log n, the total drop in potential over the sequence is at most n log n.

Static optimality theorem: For any item i let q(i) be the total number of times i is accessed. If every item is accessed at least once, then the total access time is O(m + Σ_{i=1}^{n} q(i) log(m/q(i))). This is the optimal average access time up to a constant factor.

Proof of the access lemma: The amortized time to splay a node x in a tree with root t is at most 3(r(t) - r(x)) + 1 = 3 log(s(t)/s(x)) + 1. Proof:

Intuition: In a left path on n nodes (9, 8, ..., 1 from the root down), the i-th node from the bottom has subtree size i, so the total rank (the potential) is log(n) + log(n-1) + ... + log(1) = O(n log n).

Intuition (cont): In a perfectly balanced tree, about (n+1)/2 nodes have subtree size 1, (n+1)/4 have size 3, (n+1)/8 have size 7, and so on, so the total rank is (n+1)/2 · log(1) + (n+1)/4 · log(3) + (n+1)/8 · log(7) + ... + log(n) ≤ Σ_i i(n+1)/2^i = O(n).
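The two sums can be checked numerically (the function names below are mine): the average rank per node is Θ(log n) on a path but O(1) in a perfectly balanced tree.

```python
import math

def path_total_rank(n):
    # left path: the i-th node from the bottom has subtree size i, rank log2(i)
    return sum(math.log2(i) for i in range(1, n + 1))

def balanced_total_rank(levels):
    # perfect tree on n = 2^levels - 1 nodes: 2^(levels-1-d) subtrees of
    # size 2^(d+1) - 1 at height d above the leaves
    n = 2 ** levels - 1
    total = sum((2 ** (levels - 1 - d)) * math.log2(2 ** (d + 1) - 1)
                for d in range(levels))
    return n, total

n, balanced = balanced_total_rank(10)   # n = 1023
print(path_total_rank(n) / n)           # average rank grows like log n
print(balanced / n)                     # average rank stays bounded
```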

Intuition (cont): Splaying the deepest node (1) of the left path: the first zig-zig step changes the potential by log(3) - log(1) + log(1) - log(3) = 0, and the subsequent steps change it by log(5) - log(3) + log(1) - log(5) = -log(3), then log(7) - log(5) + log(1) - log(7) = -log(5), then log(9) - log(7) + log(1) - log(9) = -log(7). The rank gained by the splayed node is more than offset by the rank lost by its ancestors, so the drop in potential pays for the long access path.

Proof of the access lemma: Consider a splay step. Let s and s', r and r' denote the size and rank functions just before and just after the step, respectively. We show that the amortized time of a zig step is at most 3(r'(x) - r(x)) + 1 = 3ΔΦ(x) + 1, and that the amortized time of a zig-zig or a zig-zag step is at most 3(r'(x) - r(x)) = 3ΔΦ(x). The lemma then follows by summing over all splay steps: the sum telescopes to 3(r(t) - r(x)), and the +1 is incurred only once, since there is at most one zig step (at the end).

Proof of the access lemma (cont): (3) zig (y is the root): amortized time(zig) = 1 + r'(x) + r'(y) - r(x) - r(y) ≤ 1 + r'(x) - r(x) (since r'(y) ≤ r(y)) ≤ 1 + 3(r'(x) - r(x)) (since r'(x) ≥ r(x)).

Proof of the access lemma (cont): (1) zig-zig: amortized time(zig-zig) = 2 + r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z) = 2 + r'(y) + r'(z) - r(x) - r(y) (since r'(x) = r(z)) ≤ 2 + r'(x) + r'(z) - 2r(x) (since r'(y) ≤ r'(x) and r(y) ≥ r(x)). Now s(x) + s'(z) ≤ s'(x), and by concavity of log, log(a) + log(b) ≤ 2 log(a+b) - 2; hence r(x) + r'(z) ≤ 2r'(x) - 2, and the bound becomes ≤ 3(r'(x) - r(x)).

Proof of the access lemma (cont): (2) zig-zag: amortized time(zig-zag) = 2 + r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z) = 2 + r'(y) + r'(z) - r(x) - r(y) (since r'(x) = r(z)) ≤ 2 + r'(y) + r'(z) - 2r(x) (since r(y) ≥ r(x)). Now s'(y) + s'(z) ≤ s'(x), so r'(y) + r'(z) ≤ 2r'(x) - 2, and the bound becomes ≤ 2(r'(x) - r(x)) ≤ 3(r'(x) - r(x)).

Static finger theorem: Suppose all items are numbered from 1 to n in symmetric order, and let the sequence of accessed items be i_1, i_2, ..., i_m. Let f be an arbitrary fixed item; then the total access time is O(n log(n) + m + Σ_{j=1}^{m} log(|i_j - f| + 1)). Splay trees support accesses in the vicinity of any fixed finger as well as finger search trees do.


Working set theorem: Let t(j), j = 1, ..., m, denote the number of different items accessed since the last access to the item accessed at time j (or since the beginning of the sequence). Working set theorem: The total access time is O(n log(n) + m + Σ_{j=1}^{m} log(t(j) + 1)). Proof: Assign weights 1, 1/4, 1/9, ..., 1/k², ... in the order of first access. After an access, say to an item of weight 1/k², change the weights so that the accessed item has weight 1, and an item that had weight 1/d² has weight 1/(d+1)², for every d < k. Weight changes after a splay only decrease the potential. The potential is nonpositive and not smaller than -2n log(n).

Application: Data Compression via Splay Trees. Suppose we want to compress text over some alphabet. Prepare a binary tree containing the symbols of the alphabet at its leaves. To encode a symbol x: traverse the path from the root to x, emitting 0 when you go left and 1 when you go right. Then splay at the parent of x and use the new tree to encode the next symbol.

Compression via splay trees (example): Encoding the string aabg... starting from a balanced tree over a, b, ..., h: the first a encodes as 000; after splaying at a's parent, the second a encodes as 0; then b encodes as 10 and g as 1110. Output so far: 0000101110.

Decoding Symmetric. The decoder and the encoder must agree on the initial tree. 49
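A hedged sketch of the whole scheme (helper names and the tree-building routine are mine; the slides only specify symbols at the leaves and "splay at the parent of x"). Because splaying is deterministic, the decoder that performs the same splays stays in sync with the encoder:

```python
class Node:
    def __init__(self, sym=None):
        self.sym = sym                # symbol at a leaf; None for internal nodes
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Rotate the edge between x and its parent."""
    y, p = x.parent, x.parent.parent
    if y.left is x:
        y.left = x.right
        if y.left: y.left.parent = y
        x.right = y
    else:
        y.right = x.left
        if y.right: y.right.parent = y
        x.left = y
    y.parent, x.parent = x, p
    if p is not None:
        if p.left is y: p.left = x
        else: p.right = x

def splay(x):
    """Standard splay: zig, zig-zig, zig-zag."""
    while x.parent is not None:
        y, z = x.parent, x.parent.parent
        if z is None:
            rotate_up(x)
        elif (z.left is y) == (y.left is x):
            rotate_up(y); rotate_up(x)
        else:
            rotate_up(x); rotate_up(x)
    return x

def build_tree(symbols):
    """Perfect binary tree with the alphabet symbols at its leaves."""
    level = [Node(s) for s in symbols]
    leaf = dict(zip(symbols, level))
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            p = Node()
            p.left, p.right = level[i], level[i + 1]
            level[i].parent = level[i + 1].parent = p
            nxt.append(p)
        level = nxt
    return level[0], leaf

def encode(root, leaf, text):
    bits = []
    for ch in text:
        x, path = leaf[ch], []
        while x.parent is not None:          # read the code word bottom-up
            path.append('0' if x.parent.left is x else '1')
            x = x.parent
        bits.extend(reversed(path))
        root = splay(leaf[ch].parent)        # splay at the parent of the leaf
    return ''.join(bits), root

def decode(root, bits):
    out, i = [], 0
    while i < len(bits):
        x = root
        while x.sym is None:                 # walk down by bits until a leaf
            x = x.left if bits[i] == '0' else x.right
            i += 1
        out.append(x.sym)
        root = splay(x.parent)               # mirror the encoder's restructuring
    return ''.join(out), root
```

On the slides' example (initial balanced tree over a-h, input aabg) this reproduces the code words 000, 0, 10, 1110, i.e. the output 0000101110.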

Compression via splay trees (analysis): How compact is this compression? Suppose m is the number of characters in the original string. The length of the string we produce is m + (cost of splays). By the static optimality theorem this is m + O(m + Σ_i q(i) log(m/q(i))) = O(m + Σ_i q(i) log(m/q(i))). Recall that the entropy Σ_i q(i) log(m/q(i)) is a lower bound.

Compression via splay trees (analysis): In particular, the length of the Huffman code of the sequence is at least Σ_i q(i) log(m/q(i)). But to construct it you need to know the frequencies in advance.

Compression via splay trees (variations): D. Jones (1988) showed that this technique can be competitive with dynamic Huffman coding (Vitter, 1987). He used a variant of splaying called semi-splaying.

Semi-splaying: In a semi-splay zig-zig step, rotate only the edge between y and z (making y the parent of both x and z), and continue the splay at y rather than at x; a regular zig-zig step instead rotates x above both y and z. (Diagrams of the two zig-zig variants omitted.)

Update operations on splay trees. Catenate(T1, T2): Splay T1 at its largest item, say i, and attach T2 as the right child of the root. Amortized time: 3 log(s(T1)/s(i)) + O(1) for the splay, plus the increase log((s(T1) + s(T2))/s(T1)) in the root's rank when T2 is attached; altogether at most 3 log(W/w(i)) + O(1).

Update operations on splay trees (cont). Split(i, T): Assume i ∈ T. Splay at i, then return the two trees formed by cutting off the right son of i. Amortized time = 3 log(W/w(i)) + O(1).

Update operations on splay trees (cont). Split(i, T): What if i ∉ T? Splay at the successor or predecessor of i (denoted i+ or i-), then return the two trees formed by cutting off the right son or the left son of the root. Amortized time = 3 log(W/min{w(i-), w(i+)}) + O(1).

Update operations on splay trees (cont). Insert(i, T): Perform split(i, T) ==> T1, T2, then return the tree with i at the root, T1 as its left subtree, and T2 as its right subtree. Amortized time: 3 log((W - w(i))/min{w(i-), w(i+)}) + log(W/w(i)) + O(1).

Update operations on splay trees (cont). Delete(i, T): Splay at i and then return the catenation of its left and right subtrees. Amortized time: 3 log(W/w(i)) + 3 log((W - w(i))/w(i-)) + O(1).
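The four update operations can be sketched on top of splay. This is a hedged sketch under my own naming (find_closest and the function names are not from the slides), and it omits the weight bookkeeping used in the amortized analysis:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    y, p = x.parent, x.parent.parent
    if y.left is x:
        y.left = x.right
        if y.left: y.left.parent = y
        x.right = y
    else:
        y.right = x.left
        if y.right: y.right.parent = y
        x.left = y
    y.parent, x.parent = x, p
    if p is not None:
        if p.left is y: p.left = x
        else: p.right = x

def splay(x):
    while x.parent is not None:
        y, z = x.parent, x.parent.parent
        if z is None:
            rotate_up(x)
        elif (z.left is y) == (y.left is x):
            rotate_up(y); rotate_up(x)
        else:
            rotate_up(x); rotate_up(x)
    return x

def find_closest(t, key):
    """Node holding key, or the last node on its search path if absent."""
    x, last = t, t
    while x is not None:
        last = x
        if key == x.key:
            return x
        x = x.left if key < x.key else x.right
    return last

def catenate(t1, t2):
    """Join two trees; every key of t1 must be smaller than every key of t2."""
    if t1 is None:
        return t2
    x = t1
    while x.right is not None:      # largest item i of T1
        x = x.right
    splay(x)                        # i becomes the root and has no right child
    x.right = t2
    if t2 is not None:
        t2.parent = x
    return x

def split(key, t):
    """Assume key is in t: splay at it and cut off the right subtree."""
    x = splay(find_closest(t, key))
    t2, x.right = x.right, None
    if t2 is not None:
        t2.parent = None
    return x, t2

def insert(key, t):
    """Split around key, then make key the root of the two pieces."""
    if t is None:
        return Node(key)
    x = splay(find_closest(t, key))
    r = Node(key)
    if key < x.key:                 # x and its right subtree go right of r
        r.left, x.left = x.left, None
        r.right = x
    else:                           # x and its left subtree go left of r
        r.right, x.right = x.right, None
        r.left = x
    for c in (r.left, r.right):
        if c is not None:
            c.parent = r
    return r

def delete(key, t):
    """Splay at key (assumed present) and catenate the two subtrees."""
    x = splay(find_closest(t, key))
    l, r = x.left, x.right
    if l is not None: l.parent = None
    if r is not None: r.parent = None
    return catenate(l, r)
```

Note the design choice the slides rely on: every operation first splays, so the restructuring cost is charged to the splay and each operation's extra work at the root is O(1).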

Open problems. Dynamic optimality conjecture: Consider any sequence of successful accesses on an n-node search tree. Let A be any algorithm that carries out each access by traversing the path from the root to the node containing the accessed item, at a cost of one plus the depth of that node, and that between accesses performs rotations anywhere in the tree, at a cost of one per rotation. Then the total time to perform all these accesses by splaying is no more than O(n) plus a constant times the cost of algorithm A.

Open problems. Dynamic finger conjecture (now a theorem): The total time to perform m successful accesses on an arbitrary n-node splay tree is O(m + n + Σ_{j=1}^{m} log(|i_{j+1} - i_j| + 1)), where the j-th access is to item i_j. A very complicated proof showed up in SICOMP recently (Cole et al.).