Lecture 7: Dynamic Programming I: Optimal BSTs


15-750: Graduate Algorithms, February 2016
Lecture 7: Dynamic Programming I: Optimal BSTs
Lecturer: David Witmer    Scribes: Ellango Jothimurugesan, Ziqiang Feng

1 Overview

The basic idea of dynamic programming (DP) is to solve a problem by breaking it up into smaller subproblems and reusing solutions to those subproblems.

1.1 Requirements

In general, there are several requirements for applying dynamic programming to a problem [Algorithm Design, Kleinberg and Tardos]:

1. The solution to the original problem can be computed from solutions to (independent) subproblems.
2. There are a polynomial number of subproblems.
3. There is an ordering of the subproblems such that the solution to a subproblem depends only on solutions to subproblems that precede it in this order.

1.2 Dynamic Programming Steps

1. Define subproblems. This is the critical step; the recurrence structure usually follows naturally once the subproblems are defined.
2. Recurrence. Write the solution to a (sub)problem in terms of solutions to smaller subproblems, i.e., as a recursion. This gives the algorithm.
3. Correctness. Prove the recurrence is correct, usually by induction.
4. Complexity. Analyze the runtime. Usually

  runtime = (number of subproblems) x (time to solve each one),

but sometimes there are more clever ways.

2 Computing the Number of Binary Search Trees

Suppose we have N keys. Without loss of generality, assume the keys are the integers {1, 2, 3, ..., N}. We ask: how many different binary search trees (BSTs) can we construct on these keys? Note that we must maintain the BST property: the key of a node is greater than all keys in its left subtree and less than all keys in its right subtree.

2.1 Define subproblems

Definition 2.1. B(n) = the number of BSTs on n nodes.

Examples 2.2 (B(n) for small n).

  B(0) = 1 (the empty tree)
  B(1) = 1 (a single node)
  B(2) = 2
  B(3) = 5

2.2 Recurrence

Definition 2.3.

  f(n) = \sum_{i=0}^{n-1} f(i) f(n-1-i),    f(0) = 1, f(1) = 1, f(2) = 2.

Note:

1. Since the function is defined in recursive form, it is necessary to give base-case values (e.g., f(0)).
2. The base cases given here are also the values of B(0), B(1), B(2).

2.3 Correctness

Claim 2.4. B(n) = f(n).

Proof. (By induction.) Base case: immediate, as B(0) = f(0) = 1, B(1) = f(1) = 1, and B(2) = f(2) = 2.

Inductive case: Assume B(n) = f(n) holds for all n < m. We will show B(m) = f(m). Let

  S_m   = {m-node BSTs}
  S_m^i = {m-node BSTs with i nodes in the left subtree}.

We can divide S_m into disjoint cases according to the number of nodes in the left subtree:

  B(m) = |S_m| = \sum_{i=0}^{m-1} |S_m^i|.

Note that because we are counting the number of nodes in the left subtree, excluding the root, the sum goes only up to m-1. Also note that we must maintain the fundamental property of a BST, so fixing i fixes the root node and the sets of keys that go to its left and right.

Next, consider S_m^i. An m-node BST with an i-node left subtree has an (m-i-1)-node right subtree:

  [Figure: root r of the m-node tree; construct an i-node tree on its left and, independently, an (m-i-1)-node tree on its right.]

Observation: If we keep the right subtree unchanged and change the left subtree (on the same i keys), we get a different valid m-node tree, and likewise for the right subtree. For example, if there are 2 possible left subtrees and 3 possible right subtrees, we get 2 * 3 = 6 different trees. Therefore

  |S_m^i| = B(i) B(m-1-i),

and so

  B(m) = \sum_{i=0}^{m-1} |S_m^i| = \sum_{i=0}^{m-1} B(i) B(m-1-i) = \sum_{i=0}^{m-1} f(i) f(m-1-i) = f(m),

where the last equality uses the inductive hypothesis.

The algorithm follows immediately from the recurrence structure.

Algorithm 1 Compute B(n)
Input: n in N
Output: B(n)
  if n = 0 or n = 1 then
    return 1
  else if n = 2 then
    return 2
  else
    return \sum_{i=0}^{n-1} B(i) B(n-1-i)
  end if
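The recursive algorithm above can be transliterated into a short Python sketch (the function name is my own; the n = 2 case collapses into the general sum since f(0) = f(1) = 1):

```python
def num_bsts(n):
    # Naive recursive evaluation of f(n) = sum_{i=0}^{n-1} f(i) * f(n-1-i).
    # Correct but exponentially slow: the same subproblems are recomputed
    # many times, as the runtime analysis below shows.
    if n <= 1:
        return 1  # base cases f(0) = f(1) = 1
    return sum(num_bsts(i) * num_bsts(n - 1 - i) for i in range(n))
```

For example, num_bsts(3) returns 5, matching B(3) = 5 from the examples.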

2.4 Runtime analysis

Define T(n) to be the runtime of computing B(n).

Naive analysis. In the naive implementation of the algorithm, we can analyze the runtime with the recursion tree:

  T(n) = \sum_{i=0}^{n-1} T(i) + n >= T(n-1) + T(n-2) >= F(n) = 2^{Theta(n)},

where F(n) is the Fibonacci sequence.

(Better) Reusing solutions to subproblems. Observe that in the recursion tree there are repeated calls to the same subproblems; for example, B(n-2) is reached both from B(n) and from B(n-1). In the naive implementation, such a solution is recomputed many times. We can improve the runtime by computing it only once and reusing it for further calls.

  [Figure: recursion tree of B(n) with children B(0), B(1), ..., B(n-2), B(n-1); the same subproblem appears in many branches, causing redundant computation.]

Formally, we note there exist dependencies among the subproblems: B(n) depends on B(n-1), ..., B(1), B(0); B(2) depends on B(1), B(0); and so on. The dependency graph is typically a DAG (directed acyclic graph):

  B(0) -> B(1) -> B(2) -> ... -> B(n-1) -> B(n)

From the DAG, we can specify an order in which to compute the solutions to the subproblems. In this case, we compute in the order B(0), B(1), B(2), ..., B(n).

Runtime. Let T_i be the cost to compute B(i) given B(0), B(1), ..., B(i-1), i.e., given the solutions to all of its dependent subproblems. Then T_i = O(i) (multiplications and additions), and

  T(n) = \sum_{i=1}^{n} T_i = \sum_{i=1}^{n} O(i) = O(n^2).
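The bottom-up order described above can be sketched as follows (a minimal illustration, with a function name of my own choosing); each subproblem is solved exactly once, for O(n^2) arithmetic operations in total:

```python
def num_bsts_dp(n):
    # Fill in B(0), ..., B(n) in dependency order: by the time B(m) is
    # computed, all of B(0), ..., B(m-1) are already available for reuse.
    B = [0] * (n + 1)
    B[0] = 1  # the empty tree
    for m in range(1, n + 1):
        B[m] = sum(B[i] * B[m - 1 - i] for i in range(m))  # O(m) work
    return B[n]
```

Unlike the naive recursion, num_bsts_dp(30) finishes instantly, since no subproblem is ever recomputed.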

Remark.5. B(n) is also called the Catalan number. It can be approximated by B(n) 4n n / π See https://en.wikipedia.org/wiki/catalan_number for more details. Optimal Binary Search Trees Suppose we are given a list of keys k < k <... < k n, and a list of probabilities p i that each key will be looked up. An optimal binary search tree is a BST T that minimizes the expected search time p i (depth T (k i ) + ). i= where the depth of the root is 0. We will assume WLOG that the keys are the numbers,,..., n. Example.. Input: k i p i 0. 0. 0.6 For n =, there are 5 possible BSTs. The following figure enumerates them and their corresponding expected search times. The optimal BST for the given input is the second tree..7.5.9.8. 5

In general, there are Theta(4^n / n^{3/2}) binary trees, so we cannot enumerate them all. By using dynamic programming, however, we can solve the problem efficiently.

3.1 Define subproblems

As we will commonly do with dynamic programming, let us first focus on computing the numeric value of the expected search time of an optimal BST, and then consider how to modify our solution in order to find the corresponding BST.

Let 1 <= i <= j <= n, and let T be any BST on keys i, ..., j. We define the cost of T as

  C(T) = \sum_{l=i}^{j} p_l (depth_T(l) + 1)

and the subproblems

  C_{ij} = min_{T on i,...,j} C(T).

The expected search time of the optimal BST is C_{1n}.

3.2 Recurrence relation

Suppose the root of a BST T on i, ..., j is k:

  [Figure: root k, with left subtree T_L on keys i, ..., k-1 and right subtree T_R on keys k+1, ..., j.]

The cost of T is

  C(T) = \sum_{l=i}^{j} p_l (depth_T(l) + 1)
       = \sum_{l=i}^{k-1} p_l (depth_{T_L}(l) + 1 + 1) + p_k + \sum_{l=k+1}^{j} p_l (depth_{T_R}(l) + 1 + 1)
       = C(T_L) + C(T_R) + \sum_{l=i}^{j} p_l.

And so we define the recurrence

  C'_{ij} = min_{i <= k <= j} { C'_{i,k-1} + C'_{k+1,j} } + \sum_{l=i}^{j} p_l    if i < j
  C'_{ij} = p_i                                                                   if i = j
  C'_{ij} = 0                                                                     if i > j.

3.3 Correctness

Claim 3.2. C'_{ij} = C_{ij}.

Proof. The proof is by induction on j - i. The base case is trivial.

C'_{ij} >= C_{ij}: Per the previous calculation, C'_{ij} is the cost of some BST on i, ..., j, while C_{ij} is the cost of an optimal BST.

C'_{ij} <= C_{ij}: Suppose the root of the optimal BST is k. Then

  C_{ij} = C_{i,k-1} + C_{k+1,j} + \sum_{l=i}^{j} p_l
         = C'_{i,k-1} + C'_{k+1,j} + \sum_{l=i}^{j} p_l        (inductive hypothesis)
        >= min_{i <= k <= j} { C'_{i,k-1} + C'_{k+1,j} } + \sum_{l=i}^{j} p_l
         = C'_{ij}.

From the recurrence, we have the following memoized algorithm.

Algorithm 2 Expected Search Time of Optimal BST
  function C(i, j)
    if C(i, j) already computed then
      return C(i, j)
    end if
    if i > j then
      return 0
    else if i = j then
      return p_i
    else
      return min_{i <= k <= j} { C(i, k-1) + C(k+1, j) } + \sum_{l=i}^{j} p_l
    end if
  end function

In order to compute the actual BST, for each subproblem we can also store the root of the corresponding subtree:

  r(i, j) = argmin_{i <= k <= j} { C(i, k-1) + C(k+1, j) }.

3.4 Runtime

There are a total of O(n^2) subproblems, and each subproblem takes O(n) time to compute, assuming all of its subproblems are already solved. Thus, the total running time is O(n^3). But we can actually do better than this.

Theorem 3.3 (Knuth, 1971). r(i, j-1) <= r(i, j) <= r(i+1, j).

By the theorem, for each subproblem we can restrict the possible values of k to the r(i+1, j) - r(i, j-1) + 1 choices in [r(i, j-1), r(i+1, j)] instead of trying all i <= k <= j. In this case, the total runtime is

  \sum_{i=1}^{n} \sum_{j=i}^{n} ( r(i+1, j) - r(i, j-1) + 1 )
    <= n^2 + \sum_{i=1}^{n} \sum_{j=i}^{n} ( r(i+1, j) - r(i, j-1) ).

For a fixed difference j - i = d, the terms r(i+1, j) - r(i, j-1) telescope along the diagonal:

  \sum_{i=1}^{n-d} ( r(i+1, i+d) - r(i, i+d-1) ) = r(n-d+1, n) - r(1, d) <= n,

where we used the fact that 1 <= r(i, j) <= n. Summing over the n possible values of d, the total runtime is at most n^2 + n * n = O(n^2).
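A bottom-up version of the memoized algorithm, including the root table r(i, j) and the restricted range of k from Knuth's theorem, might be sketched as follows (the function name and array layout are my own; the tables are padded so that C[i][i-1] = 0 handles the i > j case):

```python
def optimal_bst(p):
    # p[1..n] are lookup probabilities (p[0] is an unused placeholder).
    # Returns (C, r): C[i][j] is the optimal expected search time on keys
    # i..j, and r[i][j] is the root achieving it.
    n = len(p) - 1
    prefix = [0.0] * (n + 1)  # prefix[j] = p[1] + ... + p[j]
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + p[i]
    # Padded to (n+2) x (n+2) so C[i][i-1] = C[j+1][j] = 0 (the i > j case).
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    r = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i], r[i][i] = p[i], i  # the i = j case
    for length in range(2, n + 1):      # solve subproblems in order of j - i
        for i in range(1, n - length + 2):
            j = i + length - 1
            best, best_k = float("inf"), 0
            # Knuth's bound: only roots in [r(i, j-1), r(i+1, j)] need be tried.
            for k in range(r[i][j - 1], r[i + 1][j] + 1):
                cand = C[i][k - 1] + C[k + 1][j]
                if cand < best:
                    best, best_k = cand, k
            C[i][j] = best + prefix[j] - prefix[i - 1]
            r[i][j] = best_k
    return C, r
```

On the input of the earlier example (probabilities 0.3, 0.1, 0.6), this yields C[1][3] = 1.5 with root r[1][3] = 3, matching the optimal tree found there.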

4. Runtime There are a total of n subproblems, and each subproblem takes O(n) time to compute, assuming all its subproblems are already solved. Thus, the total running time is O(n ). But, we can actually do better than this. Theorem. (Knuth, 97). r(i, j ) r(i, j) r(i +, j). By the theorem, for each subproblem we can restrict the possible values of k to r(i +, j) r(i, j ) + choices instead of trying all i k j. For this case, the total runtime is i= j= r(i +, j) r(i, j ) + = n + where we used the fact that r(, ) n. i= j= n+ = n + i= j= n+ n + i= j= = n + = n + = O(n ) r(i +, j) r(i, j) r(i, j) i= j= r(i, j) + r(n +, j) + j= i= j= n r(i, j) i= j=0 n r(i, j) i= j= r(i, j ) r(n +, j) j= i= j= r(i, j) r(i, n) i= r(i, n) i= 8