CS 151. Red Black Trees & Structural Induction. Thursday, November 1, 12


Announcements
- Majors fair tonight 4:30-6:30pm in the Root Room in Carnegie. Come and find out about the CS major, or some other major.
- Winter Term in CS info session will happen in the next week. Details TBA.

Red-Black Trees
A red-black tree is another balanced BST, faster than AVL trees in practice.
A red-black tree is a binary search tree such that:
- color property: every node is either red or black
- root property: the root node is black
- internal property: the children of a red node are black
- depth property: for each node v, all paths from v to a null pointer contain the same number of black nodes
[Figure: example red-black tree with keys 1, 6, 8, 11, 13, 15, 17, 22, 25, 27]
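The four properties can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the `Node` class, the `RED`/`BLACK` constants, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"  # hypothetical color encoding


@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, lo=None, hi=None):
    """Check the binary search tree ordering with exclusive bounds."""
    if node is None:
        return True
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)


def black_height(node):
    """Return the number of black nodes on every node -> null path,
    or -1 if the color, internal, or depth property fails at or below node."""
    if node is None:
        return 0
    if node.color not in (RED, BLACK):           # color property
        return -1
    if node.color == RED:                        # internal property
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1
    left_h = black_height(node.left)
    right_h = black_height(node.right)
    if left_h == -1 or right_h == -1 or left_h != right_h:  # depth property
        return -1
    return left_h + (1 if node.color == BLACK else 0)


def is_red_black(root):
    """Check all four properties, plus the BST ordering."""
    if root is not None and root.color != BLACK:  # root property
        return False
    return is_bst(root) and black_height(root) != -1
```

For example, `is_red_black(Node(13, BLACK, Node(8, RED), Node(17, RED)))` holds, while a black root with a single black child fails the depth property.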

Structural Induction Review
Structural induction is a proof technique used to prove boolean properties. That is, it's used to prove that a given boolean method always returns true.
To use structural induction, you need:
- a recursively defined structure S with both a base case and a recursive case
- a boolean property / method P(S) defined on any instance of structure S
To prove that P(S) returns / is true for all S that satisfy your definition, just:
- prove that P(S) is true when S is one of the base cases, and
- prove that P(S) is true in the recursive case IF it's true for the substructures

Structural Induction Example
A red-black tree T is either:
- an empty tree, or
- a black root node r with left and right RB trees that may violate the root property (so the root must be black if either child is red; otherwise there is no restriction)
For a RB tree T that possibly violates the root property, let P(T) be the boolean property that $b(T) \ge 2^{m(T)} - 1$, where b(T) is the number of black nodes in T and m(T) is the number of black nodes on any path from the root of T to a null pointer. (Remember, you can think of P(T) as a boolean function that takes in a RB tree T that possibly violates the root property, and returns the value of the given boolean expression.)
We will show that P(T) is always true, that is, that no matter what tree you pass to the method P(T), its expression always evaluates to true.
This fact helps show that RB trees have height O(log n). (It isn't immediately obvious, but it helps nonetheless.)
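To make the "P(T) as a boolean function" view concrete, here is a minimal Python sketch. It is an illustration only: the `Node` class and its `black` flag are hypothetical, `b` counts the black nodes in the tree, and `m` counts black nodes along the leftmost root-to-null path, which assumes the depth property already holds so that every such path gives the same count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    black: bool                      # True for a black node, False for red
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def b(t):
    """Number of black nodes in t."""
    if t is None:
        return 0
    return b(t.left) + b(t.right) + (1 if t.black else 0)


def m(t):
    """Number of black nodes on a root-to-null path of t.
    Follows the leftmost path; assumes the depth property holds."""
    if t is None:
        return 0
    return m(t.left) + (1 if t.black else 0)


def P(t):
    """The boolean property b(T) >= 2^m(T) - 1."""
    return b(t) >= 2 ** m(t) - 1
```

On the empty tree, `P(None)` compares 0 with 2^0 - 1 = 0 and returns `True`. On a red root with two black leaf children, which violates only the root property, it compares 2 with 2^1 - 1 = 1.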

Structural Induction Example
For a RB tree T that possibly violates the root property, let P(T) be the boolean property that $b(T) \ge 2^{m(T)} - 1$. We will show that P(T) is true for all RB trees T by structural induction on T.
Step 1: Show that P(T) is true when T is your base case.
base case: T is an empty tree
We want to show (WTS) that P(T) is true when T is an empty tree, i.e. that $b(T) \ge 2^{m(T)} - 1$ holds when T is an empty tree.
When T is empty, b(T) = 0 and m(T) = 0. Then
$b(T) = 0$ (because T has no nodes, red or black)
$= 1 - 1$ (by algebra)
$= 2^{0} - 1$ (also by algebra)
$= 2^{m(T)} - 1$ (because there are no nodes on any root-leaf path)
So the base case is true, because P(T) is true when T is an empty RB tree.

Structural Induction Example
For a RB tree T that possibly violates the root property, let P(T) be the boolean property that $b(T) \ge 2^{m(T)} - 1$. We will show that P(T) is true for all RB trees T by structural induction on T.
Step 2: Show that P(T) is true when T is your recursive case.
inductive step: T is a root r with subtrees L and R that may violate the root property
Suppose, hypothetically, that P(L) and P(R) are true, i.e. that $b(L) \ge 2^{m(L)} - 1$ and $b(R) \ge 2^{m(R)} - 1$. This assumption is called the induction hypothesis.
We want to use these two hypothetical assumptions to show P(T) is true, i.e. that $b(T) \ge 2^{m(T)} - 1$ is always true. To do this, we need to find a connection between:
- b(T) and b(L), b(R)
- m(T) and m(L), m(R)

Structural Induction Example
To connect b(T) to b(L), b(R) and m(T) to m(L), m(R) algebraically, let x(T) = 1 if the root of T is black, and let x(T) = 0 if it is red. Then
- $b(T) = b(L) + b(R) + x(T)$ (because T contains the black nodes of L and R, plus its root if the root is black)
- $m(T) = m(L) + x(T) = m(R) + x(T)$ (because of the RB tree depth property)
Recall that we want to show that $b(T) \ge 2^{m(T)} - 1$ is true, given our IH that $b(L) \ge 2^{m(L)} - 1$ and $b(R) \ge 2^{m(R)} - 1$.
$b(T) = b(L) + b(R) + x(T)$ (because T contains the nodes of L, R, and its root)
$\ge 2^{m(L)} - 1 + 2^{m(R)} - 1 + x(T)$ (by our induction hypothesis (IH))
$= 2^{m(L)+1} - 2 + x(T)$ (because m(L) = m(R), and by algebra)
Now there are 2 cases.
- If x(T) = 0, then m(T) = m(L), so $b(T) \ge 2^{m(T)+1} - 2 \ge 2^{m(T)} - 1$, where the last step holds because $2^{m(T)} \ge 1$.
- If x(T) = 1, then m(T) = m(L) + 1, so $b(T) \ge 2^{m(T)} - 2 + 1 = 2^{m(T)} - 1$.
In both cases, we've shown our inequality holds. The proof is done!

Red-Black Trees' Heights are log n
Fact: for every RB tree T, the length of the longest root-leaf path (i.e. T's height) is at most twice the length of the shortest root-leaf path, plus 1. (This fact also helps show that RB trees' heights are O(log n).)
Proof: Let S be the shortest root-leaf path, and let L be the longest root-leaf path.
WTS (want to show) that $\mathrm{length}(L) \le 2\,\mathrm{length}(S) + 1$.
Equivalently, WTS that $\mathrm{numnodes}(L) \le 2\,\mathrm{numnodes}(S)$ (by def of length, since length = numnodes - 1).
By the internal property, no path has 2 red nodes in a row, and by the root property every root-leaf path starts at a black node, so $\mathrm{numred}(L) \le \mathrm{numblack}(L)$; equivalently, $\mathrm{numnodes}(L) \le 2\,\mathrm{numblack}(L)$.
By the depth property, $\mathrm{numblack}(L) = \mathrm{numblack}(S)$.
Putting this together:
$\mathrm{length}(L) = \mathrm{numnodes}(L) - 1$
$\le 2\,\mathrm{numblack}(L) - 1$
$= 2\,\mathrm{numblack}(S) - 1$
$\le 2\,\mathrm{numnodes}(S) - 1$
$= 2(\mathrm{numnodes}(S) - 1) + 1$
$= 2\,\mathrm{length}(S) + 1$
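As an illustration of this fact, the sketch below measures the shortest and longest root-leaf paths of a tree and compares them. It is a hedged example, not part of the original slides: the `Node` class and function names are hypothetical, and the check looks only at the tree's shape, so the colors matter only in that they determine which shapes are legal RB trees.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    black: bool                      # True for a black node, False for red
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf_path_lengths(node: Node, edges: int = 0) -> Iterator[int]:
    """Yield the length (number of nodes - 1) of each path from the
    starting node down to a leaf."""
    if node.left is None and node.right is None:
        yield edges
        return
    for child in (node.left, node.right):
        if child is not None:
            yield from leaf_path_lengths(child, edges + 1)


def path_length_range(root: Node) -> Tuple[int, int]:
    """Return (shortest, longest) root-leaf path lengths."""
    lengths = list(leaf_path_lengths(root))
    return min(lengths), max(lengths)


def satisfies_path_bound(root: Node) -> bool:
    """Check longest <= 2 * shortest + 1."""
    shortest, longest = path_length_range(root)
    return longest <= 2 * shortest + 1


# A legal RB tree in which the bound is tight: the left path is all black,
# while the right path alternates black, red, black, red.
tight = Node(True,
             left=Node(True),
             right=Node(False,
                        left=Node(True, left=Node(False), right=Node(False)),
                        right=Node(True)))
```

Here the shortest root-leaf path of `tight` has length 1 and the longest has length 3, which equals 2 * 1 + 1.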

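Red-Black Trees' Heights are log n
We already showed by induction that, for every RB tree T, $b(T) \ge 2^{m(T)} - 1$.
On the previous slide we showed that $\mathrm{height}(T) \le 2\,m(T) + 1$ (from the step $\mathrm{length}(L) \le 2\,\mathrm{numblack}(L) - 1$, where $\mathrm{numblack}(L) = m(T)$).
Then:
- $\mathrm{numNodes}(T) \ge b(T) \ge 2^{m(T)} - 1$ (the first step because numNodes = numRed + numBlack, the second by the property we showed by induction)
- $2^{m(T)} \le \mathrm{numNodes}(T) + 1$ (by algebra)
- $m(T) \le \log_2(\mathrm{numNodes}(T) + 1)$ (by taking the log of both sides)
- $\mathrm{height}(T) \le 2\log_2(\mathrm{numNodes}(T) + 1) + 1$ (by the property from the previous slide)
- $\mathrm{height}(T) \in O(\log(\mathrm{numNodes}(T)))$ (by definition of big-oh)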
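To get a feel for the size of this bound, the short Python sketch below evaluates 2 log2(n + 1) + 1 for a given node count n. The function name `height_upper_bound` is hypothetical. Since a height is a whole number, the sketch rounds the bound down.

```python
import math


def height_upper_bound(num_nodes: int) -> int:
    """Largest integer height allowed by height <= 2*log2(n + 1) + 1."""
    return math.floor(2 * math.log2(num_nodes + 1) + 1)


for n in (7, 1023, 1_000_000):
    print(n, height_upper_bound(n))
```

For a million nodes the bound comes out to 40, compared with a height of up to 999,999 for an unbalanced BST on the same number of nodes.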