Quiz 1 Solutions. (a) f 1 (n) = 8 n, f 2 (n) = , f 3 (n) = ( 3) lg n. f 2 (n), f 1 (n), f 3 (n) Solution: (b)

Similar documents
Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

Module 1: Analyzing the Efficiency of Algorithms

Lecture 6 September 21, 2016

Introduction to Algorithms March 10, 2010 Massachusetts Institute of Technology Spring 2010 Professors Piotr Indyk and David Karger Quiz 1

Advanced Data Structures

Assignment 5: Solutions

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

Divide-and-conquer: Order Statistics. Curs: Fall 2017

Advanced Data Structures

CSCI Honor seminar in algorithms Homework 2 Solution

Lecture 18 April 26, 2012

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17

Problem 5. Use mathematical induction to show that when n is an exact power of two, the solution of the recurrence

The maximum-subarray problem. Given an array of integers, find a contiguous subarray with the maximum sum. Very naïve algorithm:

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

Review Of Topics. Review: Induction

Lecture 4: Divide and Conquer: van Emde Boas Trees

Math 391: Midterm 1.0 Spring 2016

CPS 616 DIVIDE-AND-CONQUER 6-1

CSCI 3110 Assignment 6 Solutions

Design and Analysis of Algorithms March 12, 2015 Massachusetts Institute of Technology Profs. Erik Demaine, Srini Devadas, and Nancy Lynch Quiz 1

Problem Set 1. MIT students: This problem set is due in lecture on Wednesday, September 19.

CS 577 Introduction to Algorithms: Strassen s Algorithm and the Master Theorem

Fall 2016 Test 1 with Solutions

1 Substitution method

Data Structures in Java

V. Adamchik 1. Recurrences. Victor Adamchik Fall of 2005

CS325: Analysis of Algorithms, Fall Midterm

Lecture 7: Dynamic Programming I: Optimal BSTs

Recursive Definitions

Problem Set 2 Solutions

Chapter 2. Recurrence Relations. Divide and Conquer. Divide and Conquer Strategy. Another Example: Merge Sort. Merge Sort Example. Merge Sort Example

CS 161 Summer 2009 Homework #2 Sample Solutions

Chapter 5 Divide and Conquer

Advanced Counting Techniques. Chapter 8

Algorithm efficiency analysis

Introduction to Divide and Conquer

Searching. Sorting. Lambdas

University of New Mexico Department of Computer Science. Midterm Examination. CS 361 Data Structures and Algorithms Spring, 2003

data structures and algorithms lecture 2

Problem Set 1. CSE 373 Spring Out: February 9, 2016

Homework 1 Solutions

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

Divide and Conquer. Arash Rafiey. 27 October, 2016

Midterm Exam. CS 3110: Design and Analysis of Algorithms. June 20, Group 1 Group 2 Group 3

Section Summary. Sequences. Recurrence Relations. Summations Special Integer Sequences (optional)

MIDTERM Fundamental Algorithms, Spring 2008, Professor Yap March 10, 2008

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms

Data Structures and Algorithms " Search Trees!!

University of New Mexico Department of Computer Science. Final Examination. CS 561 Data Structures and Algorithms Fall, 2013

More Dynamic Programming

COMP 382: Reasoning about algorithms

Midterm 1 for CS 170

Solving recurrences. Frequently showing up when analysing divide&conquer algorithms or, more generally, recursive algorithms.

Copyright 2000, Kevin Wayne 1

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Integer Sorting on the word-ram

Solutions. Problem 1: Suppose a polynomial in n of degree d has the form

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

1 Caveats of Parallel Algorithms

CMPSCI 311: Introduction to Algorithms Second Midterm Exam

Homework 1 Submission

More Dynamic Programming

Analysis of Algorithms I: Asymptotic Notation, Induction, and MergeSort

COMP 355 Advanced Algorithms

Week 5: Quicksort, Lower bound, Greedy

Analysis of Algorithms

CSE548, AMS542: Analysis of Algorithms, Spring 2014 Date: May 12. Final In-Class Exam. ( 2:35 PM 3:50 PM : 75 Minutes )

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

CS2223 Algorithms D Term 2009 Exam 3 Solutions

Lecture 1: Asymptotics, Recurrences, Elementary Sorting

CS60007 Algorithm Design and Analysis 2018 Assignment 1

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication

Recap: Prefix Sums. Given A: set of n integers Find B: prefix sums 1 / 86

CS361 Homework #3 Solutions

Algorithms Exam TIN093 /DIT602

Computational Complexity - Pseudocode and Recursions

Analysis of Algorithm Efficiency. Dr. Yingwu Zhu

Mergesort and Recurrences (CLRS 2.3, 4.4)

Divide and Conquer. Andreas Klappenecker

COMP 355 Advanced Algorithms Algorithm Design Review: Mathematical Background

COMP Analysis of Algorithms & Data Structures

Divide-and-Conquer Algorithms Part Two

An analogy from Calculus: limits

Data structures Exercise 1 solution. Question 1. Let s start by writing all the functions in big O notation:

Disjoint-Set Forests

CS483 Design and Analysis of Algorithms

Data Structures and Algorithms Chapter 3

Module 1: Analyzing the Efficiency of Algorithms

2.2 Asymptotic Order of Growth. definitions and notation (2.2) examples (2.4) properties (2.2)

Asymptotic Algorithm Analysis & Sorting

CS 161: Design and Analysis of Algorithms

COMP Analysis of Algorithms & Data Structures

1. [10 marks] Consider the following two algorithms that find the two largest elements in an array A[1..n], where n >= 2.

CSE 421 Algorithms: Divide and Conquer

Lecture 4. Quicksort

i=1 i B[i] B[i] + A[i, j]; c n for j n downto i + 1 do c n i=1 (n i) C[i] C[i] + A[i, j]; c n

Searching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

Grade 11/12 Math Circles Fall Nov. 5 Recurrences, Part 2

Transcription:

Introduction to Algorithms October 14, 2009 Massachusetts Institute of Technology 6.006 Spring 2009 Professors Srini Devadas and Constantinos (Costis) Daskalakis Quiz 1 Solutions Quiz 1 Solutions Problem 1. Asymptotic orders of growth [9 points] (3 parts) For each of the three pairs of functions given below, rank the functions by increasing order of growth; that is, find any arrangement g 1, g 2, g 3 of the functions satisfying g 1 = O(g 2 ), g 2 = O(g 3 ). lg n represents logarithm of n to the base 2. (a) f 1 (n) = 8 n, f 2 (n) = 25 1000, f 3 (n) = ( 3) lg n (b) (c) f 2 (n), f 1 (n), f 3 (n) f 2 (n), f 3 (n), f 1 (n) f 1 (n) = 1, f 100 2(n) = 1, f n 3(n) = lg n n f 1 (n) = 2 lg3 n, f 2 (n) = n lg n, f 3 (n) = lg n! f 3 (n), f 2 (n), f 1 (n)

6.006 Quiz 1 Solutions Name 2 Problem 2. Balanced Augmented BST [20 points] (2 parts) In this question, we use balanced binary search trees for keeping a directory of MIT students. We assume that the names of students have bounded constant length so that we can compare two different names in O(1) time. Let n denotes the number of students. (a) Say we have a binary search tree with students last names as the keys with lexicographic dictionary ordering. Let k be the number of students whose last name starts with a given prefix x. For example, if the tree contains 5 names {ABC, ABD, ADA, ADB, ADC} there are k=2 names, ABC and ABD respectively, starting with the prefix x=ab. How can we output a list of all those students in O(k + log n) time? Consider the following procedure for traversing the tree. traverse(tree, prefix): if tree is empty then return name := the last name at the root of tree if prefix is a prefix of name then traverse(left subtree of tree, prefix) output(name) traverse(right subtree of tree, prefix) elseif prefix < name then traverse(left subtree of tree, prefix) else traverse(right subtree of tree, prefix) It suffices to run traverse(the entire balanced BST, x) to find all last names with prefix x. The algorithm has the following properties: It finds all names with prefix x, because it never excludes a subtree that could contain a name with prefix x. It only outputs names with x as a prefix. The running time is O(k+log n), because at each depth it visits at most two nodes for which x is not a prefix. Therefore, it visits at most 2 O(log n) + k nodes. Another correct solution: Another correct solution is to find the lexicographically first last name with prefix x, and then to continue visiting consecutive nodes in lexicographic order as in the in-order tree traversal, until we encounter a name for which x is not a prefix. This procedure visits exactly the same nodes as the first solution, and also runs in O(k + log n) time.

6.006 Quiz 1 Solutions Name 3 Common errors: It is not true that the nodes with prefix x must constitute a connected component in the tree. As a counterexample consider the following tree: ABZ / \ AAA ZZZ \ ABA The nodes with prefix AB are not connected. Finding a successor of a node in the tree does not always take O(1) time. It can take as much as O(log n) time, but for a balanced tree finding the consecutive k keys can be shown to take only O(log n + k) time. (b) Give an algorithm to return the number of nodes k with names starting with a given prefix x in O(log n) time. You are allowed to augment the binary search tree nodes. We can use a similar augmentation of Number of nodes in the left subtree and right subtree which we used for finding the Rank of a node (in PS 2 and lecture/recitation). First traverse down the tree to find a node x first which starts with the prefix x. Then traverse the left subtree of x first to find the minimum node x min (lexicographically) which starts with prefix x. Traverse the right subtree of x first to find the largest node x max (lexicographically) that starts with prefix x. x first, x min and x max can be found in O(log n) time. Using the augmentation we can find the ranks of x min and x max. The required number of nodes is equal to Rank(x max ) Rank(x min ) + 1. Another possible augmentation is for each node keeping track of number of nodes in its subtree which share a common prefix with it for all possible prefixes of that node. e.g A node with key ABCDE will keep track of number of nodes in its subtree that start with prefixes A, AB, ABC, ABCD and ABCDE. Since the words are of constant length, there are constant number of prefixes possible for each node. Now to find the number of nodes which start with prefix x, we traverse the tree to find the first node x first which starts with prefix x. The required number of nodes is the value stored in the augmentation of x first for the prefix x. In both cases, it is easy to see that only nodes on the path traversed while insertion are required to be updated for augmentation value which takes O(log n) time.

6.006 Quiz 1 Solutions Name 4 Problem 3. Employee Dictionary [15 points] (1 part) We are interested in building a phonebook for the employees of a small company of size < 64, using a dictionary. Our hash table will have size m = 64, and the employees last names will be used as keys. The longest employee name is 15 characters long, so to simplify our task we convert every last name into a 15-character string by introducing space characters in front of the first character. For instance, the last name Devadas will be padded with 8 space characters preceding the leading D. Now we encode every letter of the English alphabet together with the space character into a unique 6-bit binary number, so that every last name can be represented uniquely by a 90-bit binary number. Consider the following hash functions for our dictionary. Which of them is likely to perform the best? Justify your choice: For the hash functions you rejected, describe a collection of last names for which the hash function performs poorlyand explain why. For the hash function of your choice, explain why it is better than the other hash functions. The input k to the following functions is the 90-bit binary number corresponding to a last name: 1.h(k) = k mod 64; 2.h(k) = (a k mod 2 18 ) >> 12, for a = 4097; 3.h(k) = (a k mod 2 90 ) >> 84, for a = 1 + 2 6 54605; Note: Observe that 4097 = 1 + 2 12 and 54605 = 1101010101001101 2. The >> operator is the right shift operatr. 1.Note that 64 = 2 6. Hence, the hash function keeps the 6 least significant bits of k, which correspond to the last character of the name since every character is encoded using 6 bits. E.g., if the last name is Devadas, then the value of the hash function will be the 6-bit binary number (in decimal, a number from 0 through 63) encoding the character s. So, names ending in the same letter are hashed to the same number using this hash function. E.g. Clinton, Franklin, Jefferson, etc. are all hashed to the same number. 2.Since a = 1 + 2 12, the operation a k performs the addition of k with k := 2 12 k, which is just k shifted left (as a binary number) by 12 positions. Then the operation mod 2 18 followed by the operation >> 12 keeps the bits 13 through 18 of k + k (counting from the least significant bit, which is taken to be the 1st bit). See Figure 1. If e 1 is the number encoding the last character of the name and e 3 the number encoding the third-to-last character, it can be shown (see Figure 1) that h(k) = (e 1 + e 3 ) mod 64 it is important to note here that there is no carry from position 12 to position 13 when performing the addition of k and k. Hence, last names sharing the same last and third-to-last letters are hashed to the same number. E.g. Jefferson and Petersen are hashed to the same number.

6.006 Quiz 1 Solutions Name 5 k k e 15 90 85 e 15... e 4 e 3 e 4 e 3 e 2 e 1 e 2 e 1... 0 0 19 13 7 1 k + k... e 2 e 1 Figure 1: The second hash function. (e 1 + e 3 ) mod 6 3.Given the value of a selected for this function, the operation a k performs the addition S := k 0 + k 6 + k 8 + k 9 + k 12 + k 14 + k 16 + k 18 + k 20 + k 21, where k i is the input key k shifted left (as a binary number) by i bits. Then the operation mod 2 90 followed by the operation >> 84 selects the bits 85 through 90 of the result. The choice of a here looks better from the previous choice, because (i) there are many shifted copies of k created, which are going to be added to produce the resulting hash, and (ii) the shifts are not in multiples of 6 bits which is the length used to represent a single character (so there is also mixing at the granularity of the encoding of characters). Moreover, the function keeps the most significant positions of S, which allows for the outcomes of the mixing to be visible to the hash value. However, this function may also suffer from poor performance, because the shifts are relatively small compared to the length of the key (the maximum shift is by 21 binary positions, whereas the length of the key is 90 bits). In order for this function to perform better than the previous function, it should be the case that either most of the last names are long (longer than say 12 characters), or that the binary encoding of the letters is selected very carefully. Consider, e.g., what would happen if most of the names had length 9 characters and the binary encoding of the space character was 000000 2. It is easy to see that S 10 k 21, since k 21 is the largest summand in the summation S. So, if the space character was encoded by 0 and a last name had 9 characters, then the most significant non-zero bit of k would be at position 54. This would imply that the most significant non-zero bit of k 21 would be at position 75 (since k 21 is k shifted left by 21 positions), which would imply that the most significant non-zero bit of S is at position 79 (since S 10 k 21 and 10 2 4 ). So the bits 85 through 90 of S would all be zero. So all names of length 9 characters would be hashed to position 0.

6.006 Quiz 1 Solutions Name 6 Problem 4. Asymptotic Runtime Analysis [15 points] (1 part) Professor Devadas has an idea to assign students to recitations based on their Quiz 1 grades. He wants to split all N students into 4 recitations such that the maximum difference between the Quiz 1 grades of any two students in the same recitation is as small as possible. To figure out if this idea is reasonable, he wants to know how small the staff can make this maximum difference. Let the number of students be N and suppose that Quiz 1 scores are all integers in the range 1... M. Professor Daskalakis suggests the following Python code, which takes the value of M and a pre-sorted list of quiz scores: def assign_recitations(m, scores): N = len(scores) a = 1 b = M while a < b: c = (a+b)/2 groups = 1 group_low = scores[0] for i in range(n): if scores[i] > group_low + c: groups = groups + 1 group_low = scores[i] if groups <= 4: b = c else: a = c+1 return a What is the asymptotic running time of this algorithm? Express your answer as a function of N and/or M using Θ-notation. The inner loop (the for loop) runs in constant time per iteration with N iterations. The outer loop is performing binary search on the interval [1, M] and thus takes log 2 M iterations. Thus, the runtime is Θ(log M) Θ(N) Θ(1) = Θ(N log M).

6.006 Quiz 1 Solutions Name 7 Problem 5. Airplane Scheduling [20 points] (1 part) Logan airport management asks you to compute the airport load during one day, which is the maximum number of planes that are on the ground at the same time. You are provided the data for n airplanes arrival and departure times. To simplify computation, the day is divided into m time segments and only these segments are recorded. For the plane i, start[i] denotes the time segment of arrival and end[i] denotes the time segment of departure; 1 start[i] end[i] m. A plane is considered to be on the ground during arrival and departure time segments, as well as all segments in between. For example, if n = 5, m = 10 and planes (arrival, departure) pairs are {(5, 7), (1, 3), (8, 10), (2, 5), (4, 9)}, then the airport load is 3 (during time segment 5 planes 1, 4, and 5 are on the ground). Give an O(m + n) time algorithm to find the airport load. If you provide an algorithm with worse time complexity you will be given partial credit. /* Lists to keep track of arrivals and departures */ for(i = 1; i <= m; i++) A[i]= D[i] = 0 /* Go through (arrival, departure) pairs, incrementing count */ for(j = 1; j <= n; j++) { A[arrival[j]] = A[arrival[j]] + 1 D[departure[j]] = D[departure[j]] + 1 } /* Compute current load and compare with max load */ max_load = 0; load = 0; for(i = 1; i < m; i++) { load = load + A[i]; if (load > max_load) max_load = load; load = load - D[i]; } The first loop is O(m). The second loop is O(n). The third loop is O(m). Note that the max load check has to happen after incrementing load by the number of arrivals and before decrementing by the number of departures, in order to take into acount the fact that a plane is considered to be on the ground during its departure segment. The above two arrays can be merged into a single array C. We will increment C[arrival[j]] and decrement C[departure[j]+1] in the second loop. Then, we keep adding the entries of C and find the maximum sum, which will be the maximum load. The +1 added to departure[j] ensures that we treat a plane as being on the ground during its departure segment.

6.006 Quiz 1 Solutions Name 8 Problem 6. Multiplying two N-bit Numbers [20 points] (2 parts) Let X and Y be two n-bit numbers in base B and we would like to compute their product Z = XY. A naive divide-and-conquer technique can work as follows: Divide X into two (n/2)-bit numbers X 1 and X 0 and similarily Y into Y 1 and Y 0. Let b = B n/2. X = X 1 b + X 0 Y = Y 1 b + Y 0 Z = (X 1 b + X 0 )(Y 1 b + Y 0 ) Z = Z 2 b 2 + Z 1 b + Z 0 where Z 2 = X 1 Y 1, Z 1 = X 1 Y 0 + X 0 Y 1 and Z 0 = X 0 Y 0 can be obtained recursively. Assume addition of two n-bit numbers takes Θ(n) time. Also assume that multiplying an n-bit number by B m also takes Θ(n + m) time. (a) Prove that the above naive divide-and-conquer algorithm runs in Θ(n 2 ) time. Let T (n) denote the running time of the algorithm on n-bit numbers. It consists of several computations. Splitting X into X 1 and X 0, as well as splitting Y into Y 1 and Y 0, takes Θ(n) time. Computing X 1 Y 1, X 1 Y 0, X 0 Y 1 and X 0 Y 0 recursively takes 4T ( n) time, as each subproblem performs multiplication of two n -bit numbers. 2 2 Adding X 1 Y 0 and X 0 Y 1 to compute Z 1 takes Θ(n) time, as it is addition of two n-bit numbers. In the last step, computing Z = Z 2 b 2 + Z 1 b + Z 0 takes Θ(n) time because Z 2 b 2 is multiplication of n-bit number with b 2 = B n, Z 1 b is multiplication of n-bit number with b = B n 2, and there is addition of three numbers of size at most 2n. Finally, we can write: ( n ) T (n) = 4T + f(n), 2 where f(n) = Θ(n). We can apply Master Theorem to solve this recurrence. a = 4, b = 2, log b a = log 2 4 = 2, n log b a = n 2. Therefore, f(n) = Θ(n) = O(n 2 ɛ ) for any 0 < ɛ 1 (take ɛ = 1 for example). From case 1 of Master Theorem it follows that T (n) = Θ(n log b a ) = Θ(n 2 ). (b) It turns out that we only need 3 multiplications to compute Z 2,Z 1 and Z 0 instead of 4, required in the previous case, at the cost of some extra additions as follows: Z 2 = X 1 Y 1, Z 0 = X 0 Y 0 and Z 1 = (X 1 + X 0 )(Y 1 + Y 0 ) Z 2 Z 0 What is the time complexity of this algorithm in terms of n?

6.006 Quiz 1 Solutions Name 9 There are three subproblems in the modified algorithm. Computing X 1 Y 1, X 0 Y 0 and (X 1 +X 0 )(Y 1 +Y 0 ) recursively takes 3T ( n ) time, while all other operations 2 take Θ(n) time (there are two subtractions in computing Z 1, while Z is computed as before). Now, the recurrence is: ( n ) T (n) = 3T + f(n), 2 where f(n) = Θ(n). We can again apply Master Theorem. a = 3, b = 2, log b a = log 2 3, n log b a = n log 2 3. Therefore, f(n) = Θ(n) = O(n log 2 3 ɛ ) for any 0 < ɛ log 2 3 1 (there exists such ɛ because log 2 3 > 1). Again, from case 1 of Master Theorem it follows that T (n) = Θ(n log b a ) = Θ(n log 2 3 ). Note that this algorithm is better then the previous one, since log 2 3 < 2 and, correspondingly, Θ(n log 2 3 ) is lower asymptotic time than Θ(n 2 ).