CSE 5311 Notes 5: Trees. (Last updated 6/4/13 4:12 PM)

Similar documents
Illinois Institute of Technology Department of Computer Science. Splay Trees. CS 535 Design and Analysis of Algorithms Fall Semester, 2018

AVL trees. AVL trees

Chapter 6. Self-Adjusting Data Structures

Outline for Today. Static Optimality. Splay Trees. Properties of Splay Trees. Dynamic Optimality (ITA) Balanced BSTs aren't necessarily optimal!

A STUDY ON SPLAY TREES

Demonstrate solution methods for systems of linear equations. Show that a system of equations can be represented in matrix-vector form.

Splay trees (Sleator, Tarjan 1983)

Simplification of State Machines

8. BOOLEAN ALGEBRAS x x

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Data Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86

Lecture 2 14 February, 2007

Hamiltonicity and Fault Tolerance

Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Problem. Maintain the above set so as to support certain

Chapter #6 EEE State Space Analysis and Controller Design

Roberto s Notes on Integral Calculus Chapter 3: Basics of differential equations Section 3. Separable ODE s

2 Push and Pull Source Sink Push

Next: Pushdown automata. Pushdown Automata. Informal Description PDA (1) Informal Description PDA (2) Formal Description PDA

. This is the Basic Chain Rule. x dt y dt z dt Chain Rule in this context.

Multi-Objective Weighted Sampling

Integer Sorting on the word-ram

Handout for Adequacy of Solutions Chapter SET ONE The solution to Make a small change in the right hand side vector of the equations

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression

Chapter Adequacy of Solutions

Disjoint Sets : ADT. Disjoint-Set : Find-Set

CSE 5311 Notes 2: Binary Search Trees

The Maze Generation Problem is NP-complete

Note that r = 0 gives the simple principle of induction. Also it can be shown that the principle of strong induction follows from simple induction.

INF Anne Solberg One of the most challenging topics in image analysis is recognizing a specific object in an image.

Approximation: Theory and Algorithms

The Steiner Ratio for Obstacle-Avoiding Rectilinear Steiner Trees

Advanced Data Structures

Design and Analysis of Algorithms

Solutions to Two Interesting Problems

Lecture VIII Keywords: tree-based data structures; binary search trees; red/black trees; flexible functional arrays; heaps; priority queues.

1.1 Laws of exponents Conversion between exponents and logarithms Logarithm laws Exponential and logarithmic equations 10

UNCORRECTED SAMPLE PAGES. 3Quadratics. Chapter 3. Objectives

Point Equilibrium & Truss Analysis

Lecture #8: We now take up the concept of dynamic programming again using examples.

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom

Problem Set 4 Solutions

Uncertainty and Parameter Space Analysis in Visualization -

Strain Transformation and Rosette Gage Theory

Language and Statistics II

A General Lower Bound on the I/O-Complexity of Comparison-based Algorithms

Advanced Data Structures

past balancing schemes require maintenance of balance info at all times, are aggresive use work of searches to pay for work of rebalancing

Outline. Computer Science 331. Cost of Binary Search Tree Operations. Bounds on Height: Worst- and Average-Case

Review of Essential Skills and Knowledge

MAT 1275: Introduction to Mathematical Analysis. Graphs and Simplest Equations for Basic Trigonometric Functions. y=sin( x) Function

CS 395T Computational Learning Theory. Scribe: Mike Halcrow. x 4. x 2. x 6

8: Splay Trees. Move-to-Front Heuristic. Splaying. Splay(12) CSE326 Spring April 16, Search for Pebbles: Move found item to front of list

Problem: Data base too big to fit memory Disk reads are slow. Example: 1,000,000 records on disk Binary search might take 20 disk reads

Trusses - Method of Sections

Exploiting Correlation in Stochastic Circuit Design

Copyright 2000, Kevin Wayne 1

7 Dictionary. 7.1 Binary Search Trees. Binary Search Trees: Searching. 7.1 Binary Search Trees

y R T However, the calculations are easier, when carried out using the polar set of co-ordinates ϕ,r. The relations between the co-ordinates are:

Fundamental Algorithms

a. plotting points in Cartesian coordinates (Grade 9 and 10), b. using a graphing calculator such as the TI-83 Graphing Calculator,

THE KENNESAW STATE UNIVERSITY HIGH SCHOOL MATHEMATICS COMPETITION PART I MULTIPLE CHOICE NO CALCULATORS 90 MINUTES

13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks

CIS Spring 2018 (instructor Val Tannen)

2 Generating Functions

Number Plane Graphs and Coordinate Geometry

Let s try an example of Unit Analysis. Your friend gives you this formula: x=at. You have to figure out if it s right using Unit Analysis.

Binary Search Trees. Dictionary Operations: Additional operations: find(key) insert(key, value) erase(key)

Applications of Gauss-Radau and Gauss-Lobatto Numerical Integrations Over a Four Node Quadrilateral Finite Element

CONTINUOUS SPATIAL DATA ANALYSIS

Systems of Linear Equations: Solving by Graphing

INF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification

ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October

Dictionary: an abstract data type

COMPUTING π(x): THE MEISSEL, LEHMER, LAGARIAS, MILLER, ODLYZKO METHOD

The Cell Probe Complexity of Dynamic Range Counting

1 ode.mcd. Find solution to ODE dy/dx=f(x,y). Instructor: Nam Sun Wang

CIRCUIT COMPLEXITY AND PROBLEM STRUCTURE IN HAMMING SPACE

An Information Theory For Preferences

INTRODUCTION TO DIOPHANTINE EQUATIONS

Dictionary: an abstract data type

Computing Connected Components Given a graph G = (V; E) compute the connected components of G. Connected-Components(G) 1 for each vertex v 2 V [G] 2 d

Nova Scotia Examinations Mathematics 12 Web Sample 2. Student Booklet

Trusses - Method of Joints

Optimal Control of Hybrid Systems Using Statistical Learning

Boolean Algebra and Logic Gates

RELATIONS AND FUNCTIONS through

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

Part III. Data Structures. Abstract Data Type. Dynamic Set Operations. Dynamic Set Operations

The first change comes in how we associate operators with classical observables. In one dimension, we had. p p ˆ

Chapter R REVIEW OF BASIC CONCEPTS. Section R.1: Sets

EPIPOLAR GEOMETRY WITH MANY DETAILS

Lecture 15 April 9, 2007

Shannon-Fano-Elias coding

A Randomized Algorithm for Two Servers in Cross Polytope Spaces

Chapter 6: Securing neighbor discovery

arxiv: v1 [cs.ds] 15 Feb 2012

Covariance and Correlation Class 7, Jeremy Orloff and Jonathan Bloom

And similarly in the other directions, so the overall result is expressed compactly as,

CS Data Structures and Algorithm Analysis

Transcription:

SE 511 Notes 5: Trees (Last updated 6//1 :1 PM) What is the optimal wa to organie a static tree for searching? n optimal (static) binar search tree is significantl more complicated to construct than an optimal list. 1. ssume access probabilities are known: kes are K 1 < K < < K n pi = probabilit of request for Ki qi = probabilit of request with Ki < request < Ki+1 q 0 = probabilit of request < K 1 qn = probabilit of request > Kn. ssume that levels are numbered with root at level 0. Minimie the epected number of comparisons to complete a search:. Eample tree: n n p j ( KeLevel( j) +1) + q j MissLevel j j=1 j=0 0 K 1p 1 K1 p1 K p q0 q1 K K5 p p5 q q q q5. Solution is b dnamic programming: Principle of optimalit - solution is not optimal unless the subtrees are optimal. ase case - empt tree, costs nothing to search.

pi+1 pj qi qj qi+1 qj-1 c( i, j) cost of subtree with kes K i+1,,k j c( i, j) alwas includes eactl p i+1,, p j and q i,,q j c( i,i) = 0 ase case, no kes, just misses for q i (request between K i and K i+1 ) Recurrence for finding optimal subtree: c( i, j) = w( i, j) + min ( c( i,k 1) + c( k, j) ) i<k j tries ever possible root ( k ) for the subtree with kes K i+1,,k j w( i, j) = p i+1 + + p j + q i + + q j accounts for adding another probe for all kes in subtree : Left: pi+1 + + pk 1 + qi + qk 1 Right: p k+1 + + p j + q k + q j Root: p k Kk c(i,k-1) c(k,j) 5. Implementation: k-famil is all cases for c i,i + k. k-families are computed in ascending order from 1 to n. Suppose n = 5:

_0 1 _ 5 c( 0,0) c( 0,1) c( 0,) c( 0,) c 0, c( 1,1 ) c( 1, ) c( 1, ) c( 1, ) c( 1,5 ) c(,) c(,) c(,) c(,5) c(,) c(, ) c(,5) c(,) c(,5) c 5,5 c( 0,5) ompleit: O n space is obvious. O n time from: n k=1 k( n +1 k) where k is the number of roots for each c i,i + k in famil k. and n +1 k is the number of c( i,i + k) cases 6. Traceback - besides having the minimum value for each c( i, j), it is necessar to save the subscript for the optimal root for c i, j as r[i][j]. This also leads to Knuth s improvement: must have a ke with subscript no less than the ke and no greater than the ke subscript for the Theorem: The root for the optimal tree c i, j subscript for the root of the optimal tree for c i, j 1 root of optimal tree c( i +1, j). (These roots are computed in the preceding famil.) Proof: 1. onsider adding p j and q j to tree for c( i, j 1). Optimal tree for c i, j at the root or use one further to the right. must keep the same ke Ki+1 Kj-1. onsider adding p i+1 and q i to tree for c( i +1, j). Optimal tree for c i, j ke at the root or use one further to the left. must keep the same Ki+ Kj

7. nalsis of Knuth s improvement. case for k-famil will var in the number of roots to tr, but overall time is reduced to Each c i, j O n b using a telescoping sum: n n k n ( r[ i +1] [ i + k] r[ i] [ i + k 1] +1) = k= i=0 k= n = ( r[ n k +1] [ n] r[ 0] [ k 1] + n k +1) k= n n ( n 0 + n k +1) = ( n k +1) = O n k= k= r[ 1] [ k] r[ 0] [ k 1] +1 + r[ ] [ 1+ k] r[ 1] [ k] +1 + r[ ] [ + k] r[ ] [ 1+ k] +1 + + r[ n k +1] [ n] r[ n k] [ n 1] +1 n=7; q[0]=0.06; p[1]=0.0; q[1]=0.06; p[]=0.06; q[]=0.06; p[]=0.08; q[]=0.06; p[]=0.0; q[]=0.05; p[5]=0.10; q[5]=0.05; p[6]=0.1; q[6]=0.05; p[7]=0.1; q[7]=0.05; for (i=1;i<=n;i++) ke[i]=i;

5 w[0][0]=0.060000 w[0][1]=0.160000 w[0][]=0.80000 w[0][]=0.0000 w[0][]=0.90000 w[0][5]=0.60000 w[0][6]=0.810000 w[0][7]=1.000000 w[1][1]=0.060000 w[1][]=0.180000 w[1][]=0.0000 w[1][]=0.90000 w[1][5]=0.50000 w[1][6]=0.710000 w[1][7]=0.900000 w[][]=0.060000 w[][]=0.00000 w[][]=0.70000 w[][5]=0.0000 w[][6]=0.590000 w[][7]=0.780000 w[][]=0.060000 w[][]=0.10000 w[][5]=0.80000 w[][6]=0.50000 w[][7]=0.60000 w[][]=0.050000 w[][5]=0.00000 w[][6]=0.70000 w[][7]=0.560000 w[5][5]=0.050000 w[5][6]=0.0000 w[5][7]=0.10000 w[6][6]=0.050000 w[6][7]=0.0000 w[7][7]=0.050000 ounts - root trick without root trick 77 verage probe length is.680000 trees in parenthesied prefi c(0,0) cost 0.000000 c(1,1) cost 0.000000 c(,) cost 0.000000 c(,) cost 0.000000 c(,) cost 0.000000 c(5,5) cost 0.000000 c(6,6) cost 0.000000 c(7,7) cost 0.000000 c(0,1) cost 0.160000 1 c(1,) cost 0.180000 c(,) cost 0.00000 c(,) cost 0.10000 c(,5) cost 0.00000 5 c(5,6) cost 0.0000 6 c(6,7) cost 0.0000 7 c(0,) cost 0.0000 (1,) c(1,) cost 0.500000 (,) c(,) cost 0.00000 (,) c(,5) cost 0.10000 5(,) c(,6) cost 0.570000 6(5,) c(5,7) cost 0.60000 7(6,) c(0,) cost 0.780000 (1,) c(1,) cost 0.700000 (,) c(,5) cost 0.80000 (,5) c(,6) cost 0.800000 5(,6) c(,7) cost 1.000000 6(5,7) c(0,) cost 1.050000 (1,(,)) c(1,5) cost 1.10000 (,5(,)) c(,6) cost 1.10000 5((,),6) c(,7) cost 1.90000 6(5(,),7) c(0,5) cost 1.90000 ((1,),5(,)) c(1,6) cost 1.60000 5((,),6) c(,7) cost 1.810000 5((,),7(6,)) c(0,6) cost.050000 ((1,),5(,6)) c(1,7) cost.0000 5((,),7(6,)) c(0,7) cost.680000 5((1,(,)),7(6,)) : c(0,) + c(,7) + w[0][7] 0. 1.9 1.0 =.7 : c(0,) + c(,7) + w[0][7] 0.78 1.0 1.0 =.78 5: c(0,) + c(5,7) + w[0][7] 1.05 0.6 1.0 =.68 1 5 6 7

SPLY TREES 6 Self-adjusting counterpart to VL and red-black trees dvantages - 1) no balance bits, ) some help with localit of reference, ) amortied compleit is same as VL and red-black trees isadvantage - worst-case for operation is O(n) lgorithms are based on rotations to spla the last node processed () to root position. Zig-Zig: 1. Single right rotation at.. Single right rotation at. (+ smmetric case) Zig-Zag: ouble right rotation at. (+ smmetric case) Zig: pplies ONLY at the root Single right rotation at. (+ smmetric case) Insertion: ttach new leaf and then spla to root.

7 eletion: 1. ccess node to delete, including spla to root.. ccess predecessor in left subtree and spla to root of left subtree.. Take right subtree of and make it the right subtree of. mortied nalsis of Splaing for Retrieval (aside): ctual cost (rotations) is for ig-ig and ig-ag, but 1 for ig. S() = number of nodes in subtree with as root ( sie ) r() = lg S() ( rank ) = r Φ T T Eamples:.=lg 5 E.=lg 5 1.58=lg E =lg 1.58=lg Φ=.9 1=lg Φ=6.9 Now suppose that the leaf in the second eample is retrieved. Two ig-igs occur.

E.=lg 5.=lg 5 8 =lg =lg 1.58=lg Φ=6.9 1=lg E 1=lg Φ=5. ˆ i = i + Φ( fter) Φ( efore) = + 5. 6.9 =. 1+ lg n = 7.96 nother eample of splaing. There will be a ig-ag and a ig. E.17=lg 9 E.17=lg 9.17=lg 9 =lg H F 1=lg 1=lg G Φ=9.17 =lg I =lg H F 1=lg 1=lg G I Φ=9.17 =lg 1=lg Φ=9.98 E F 1=lg.81=lg 7 H G =lg I ˆ i = i + Φ fter Φ( efore) = + 9.98 9.17 =.81 1+ lg n =10.51 ompute amortied compleit of individual steps and then the complete splaing sequence: Lemma: If α > 0, β > 0, α + β 1, then lgα + lgβ. Proof: lgα + lgβ = lgαβ. αβ is maimied when α = β = 1, so ma lgα + lgβ =. ccess Lemma: Suppose 1) is node being splaed ) subtree rooted b has and r i 1 before ith step and r i after ith step S i 1 S i then ˆ i r i r i 1, ecept last step which has ˆ i 1+ r i r i 1.

Proof: Proceeds b considering each of the three cases for splaing: 9 Zig-Zig: ˆ i = i + Φ T i = + r i = + ri + r i (*) + ri Let α = S i 1 S i Φ( T i 1 ) + r i + r i ( ) r i 1 r i 1 r i 1 ( ) Potential changes onl in this subtree + ri ( ) ri 1 ri 1 ri = ri 1 ( ) + r i ( ) r i 1 r i 1 r i 1 + ri ( ) ri 1 ri ri, β = S i( ) S i, α > 0, β > 0, α + β = S i 1 S i +S i 1. ( is absent from numerator) Lemma conditions are satisfied, so lgα + lgβ. ppling logs to α and β gives: ri 1 + ri ( ) ri which ma be rearranged as: 0 ri ri 1 ri ( ) dd this to (*) to obtain: ˆ i r i r i 1 Zig-Zag:

ˆ i = i + Φ T i = + r i = + r i (**) + r i Φ( T i 1 ) + r i + r i ( ) r i 1 r i 1 r i 1 ( ) Potential changes onl in this subtree + r i ( ) r i 1 r i 1 r i = r i 1 ( ) + r i ( ) r i 1 r i 1 r i 1 Lemma ma be applied b observing that S i + S i ( ) S i and thus + S i( ) S i 1 S i S i lemma, lg S i S i + lg Si S i + r i ( ) r i + r i ( ) r i which can substitute into (**) r i r i ˆ i + ri = r i r i ri 1 r i 1 r i 1 Since r i 1 r i 10 Zig: ˆ i = i + Φ Ti =1+ ri =1+ r i 1+ r i 1+ r i Φ( Ti 1 ) + ri ri 1 ri 1 Potential changes onl in this subtree r i 1 r i 1 = r i r i 1 r i r i r i 1 r i 1 r i

ound on total amortied cost for an entire spla sequence: 11 m m 1 ˆ i = ˆ i + ˆ m i=1 i=1 m 1 r i i=1 = r m 1 ( r i 1 ) =1+ r m 1+ r m +1+ r m r m 1 r 0 +1+ r m r m 1 r 0 =1+ lgn since is the root after the final rotation sides: If each node is assigned a positive weight and the sie of node, S(), is the sum of the weights in the subtree, other results ma be shown such as: Static Optimalit: Spla trees (online) perform within a constant factor of the optimal (static) binar search tree (offline) for a sequence of requests. ut the elusive goal remains: namic Optimalit onjecture: Spla trees (online) perform within a constant factor of the optimal (dnamic) binar search tree (offline) for a sequence of requests. In principle, this ma be attacked like MTF is dnamicall optimal. Man difficulties arise... Potential Function: Minimum number of rotations to transform a ST into another ST? Transformation: Suppose a ST with n nodes has kes 1... n. onstruct the regular (n+)-gon with vertices 0, 1,,..., n+1. If the ST has a subtree with root i, minimum ke min( i), and maimum ke ma( i), then include edges { i,min( i) 1} and { i,ma( i) +1}. lso, include edge { 0,n +1}. These edges give a triangulated polgon. ( bound of Ο loglog n as been shown: http://erikdemaine.org/papers/tango_fos00/)

1 ST rotation corresponds to flipping the diagonal of a quadrilateral: 1 5 1 5 5 1 0 1 5 6 0 1 5 6 0 1 5 6 Rotate Left Rotate Right Ma also rotate the original ST left at 1 and right at 5.

SE 511 Notes 6: Skip Lists (Last updated 6//1 :1 PM) randomied data structure with performance characteristics similar to treaps, i.e. another alternative to balanced binar search trees. Paper available at ftp://ftp.cs.umd.edu/pub/skiplists/skiplists.pdf. dvantages: Simple algorithms, especiall for ordered retrieval Highl likel to perform as well as balanced trees. Less space: isadvantages: No balance bits. an be tuned to use fewer pointers per node (on average). oesn t handle etreme changes in number of records well. ad worse case (similar to hashing), but depends on random generator rather than data. urden on memor management due to varing sie nodes.

esign ecision : Probabilit 0 < p 0.5 (0.5 tpical) for determining number of pointers in a new node: p k 1 = probabilit of a node having at least k pointers 1 = Epected number of pointers (used) in each node. (geometric sum) 1 p haracteristics: Low p Fewer pointers in structure Increased search time (alwas epected to be logarithmic). High p More pointers in structure ecreased search time