CS1800 Discrete Structures
Profs. Aslam, Gold, & Pavlu
Fall 2017
November 10, 2017

Written Homework 5

Assigned: Mon Nov 13 2017
Due: Wed Nov 29 2017

Instructions:

The assignment has to be uploaded to blackboard by the due date. NO assignment will be accepted after 11:59pm on that day.

We expect that you will study with friends and often work out problem solutions together, but you must write up your own solutions, in your own words. Cheating will not be tolerated. Professors, TAs, and peer tutors will be available to answer questions but will not do your homework for you. One of our course goals is to teach you how to think on your own.

We require that all homework submissions be neat and easily readable. We recommend using a word processor like Microsoft Word or LaTeX for your submissions. If you scan your homework, you may lose points if the scan is not legible.

To get full credit, show INTERMEDIATE steps leading to your answers, throughout.

You can give answers that are probabilities as either fully simplified fractions or decimal answers to at least two nonzero digits.

Problem 1 [12 pts (3,3,3,3)]: Sequences

For each of the following lists of integers, (1) indicate whether the sequence is arithmetic, geometric, quadratic, or none of these and (2) give a simple formula that generates the terms of this sequence, where your list elements begin at \(n = 1\). For example, the sequence \(3, 5, 7, 9, 11, \ldots\) is arithmetic and generated by the formula \(a_n = 2n + 1\) starting at \(n = 1\).

i. \(-5, -1, 3, 7, 11, 15, \ldots\)

Solution: This is an arithmetic sequence since the difference between consecutive terms is a constant (in this case, 4). The sequence starts at \(-5\) for \(n = 1\) and increases by 4 for each subsequent term. As such, the solution is

\[ a_n = -5 + (n - 1) \cdot 4 = 4n - 9. \]
ii. \(2, -6, 18, -54, 162, -486, \ldots\)

Solution: This is a geometric sequence since the ratio between consecutive terms is a constant (in this case, \(-3\)). The sequence starts at 2 for \(n = 1\) and changes by a factor of \(-3\) for each subsequent term. As such, the solution is

\[ a_n = 2 \cdot (-3)^{n-1} = -\tfrac{2}{3} (-3)^n. \]

iii. \(0, 5, 16, 33, 56, 85, \ldots\)

Solution: The ratio between consecutive terms is not constant, so the sequence is not geometric. The first differences \((5, 11, 17, 23, \ldots)\) are also not constant, so the sequence is not arithmetic. However, the second differences are constant (in this case, 6), so the sequence is quadratic. Quadratic sequences can be solved in multiple ways; here we set up 3 equations in 3 unknowns to determine the solution.

Every quadratic sequence is of the form \(a_n = a \cdot n^2 + b \cdot n + c\) for some constants \(a, b, c\). To determine these constants, we set up 3 equations in 3 unknowns by considering the values of \(a_n\) for \(n = 1, 2, 3\).

\[ n = 1: \quad 0 = a \cdot 1^2 + b \cdot 1 + c \]
\[ n = 2: \quad 5 = a \cdot 2^2 + b \cdot 2 + c \]
\[ n = 3: \quad 16 = a \cdot 3^2 + b \cdot 3 + c \]

Simplifying and rearranging the equations, we have:

\[ a + b + c = 0 \tag{1} \]
\[ 4a + 2b + c = 5 \tag{2} \]
\[ 9a + 3b + c = 16 \tag{3} \]

Subtracting Equation 1 from Equation 2, and subtracting Equation 2 from Equation 3, we obtain:

\[ 3a + b = 5 \tag{4} \]
\[ 5a + b = 11 \tag{5} \]

Finally, subtracting Equation 4 from Equation 5, we obtain

\[ 2a = 6. \]

Solving for \(a\), we obtain \(a = 3\). Substituting this value of \(a\) into Equation 4 or 5 yields \(b = -4\). Finally, substituting these values of \(a\) and \(b\) into Equation 1, 2, or 3 yields \(c = 1\). Thus, our solution is \(a_n = 3n^2 - 4n + 1\).
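The elimination steps in solution iii can be checked mechanically. The following is a minimal Python sketch, not part of the assigned method; the function name `fit_quadratic` is hypothetical, and it assumes the three terms given are the values at n = 1, 2, 3.

```python
from fractions import Fraction

def fit_quadratic(t1, t2, t3):
    """Return (a, b, c) so that a*n**2 + b*n + c equals t1, t2, t3 at n = 1, 2, 3."""
    # Equation 5 minus Equation 4: 2a equals the second difference.
    a = Fraction((t3 - t2) - (t2 - t1), 2)
    # Equation 4: 3a + b equals the first of the first differences.
    b = (t2 - t1) - 3 * a
    # Equation 1: a + b + c equals the first term.
    c = t1 - a - b
    return a, b, c

terms = [0, 5, 16, 33, 56, 85]
a, b, c = fit_quadratic(*terms[:3])
# Check the fitted formula against every listed term, not just the first three.
formula_matches = all(a * n * n + b * n + c == t for n, t in enumerate(terms, start=1))
```

Using `Fraction` keeps the arithmetic exact even when the second difference is odd, in which case the leading coefficient is not an integer.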
iv. \(6, 12, 24, 48, 96, 192, \ldots\)

Solution: This is a geometric sequence since the ratio between consecutive terms is a constant (in this case, 2). The sequence starts at 6 for \(n = 1\) and changes by a factor of 2 for each subsequent term. As such, the solution is

\[ a_n = 6 \cdot 2^{n-1} = 3 \cdot 2^n. \]

Problem 2 [16 pts (4,4,4,4)]: Sums

Evaluate the following sums. You must apply the methods given in class and in the text; i.e., you cannot simply add the numbers with a calculator or write a program. Show your work, and your final answer should be a single integer.

i. \(-9 - 4 + 1 + 6 + 11 + \cdots + 66\)

Solution: This is an arithmetic series, as the difference between consecutive terms is a constant (in this case, 5). Arithmetic series can be solved in multiple ways; here we apply Gauss's trick as discussed in class and the text. Let \(S\) be the sum, and write the series forwards and backwards:

\[ S = -9 - 4 + 1 + 6 + 11 + \cdots + 66 \]
\[ S = 66 + 61 + 56 + 51 + 46 + \cdots - 9 \]

Now, if we sum the two equations, we obtain \(2S\) on the left-hand side. On the right-hand side, summing term-by-term, we obtain 57 for each summand.

\[ 2S = 57 + 57 + \cdots + 57 \]

The number of terms on the right-hand side is

\[ \frac{66 - (-9)}{5} + 1 = 16 \]

and thus

\[ 2S = 16 \cdot 57. \]

As such, we have \(S = 16 \cdot 57 / 2 = 8 \cdot 57 = 456\).

ii. \(2 + 6 + 18 + 54 + \cdots + 1458\)

Solution: This is a geometric series, as the ratio between consecutive terms is a constant (in this case, 3). Geometric series can be solved in multiple ways; here we apply the method discussed in class and the text. Let \(S\) be the sum, and write the series for \(S\) and \(3S\):

\[ S = 2 + 6 + 18 + 54 + \cdots + 1458 \]
\[ 3S = 6 + 18 + 54 + \cdots + 1458 + 4374 \]

Now, if we subtract the first equation from the second, the terms \(6, 18, \ldots, 1458\) all cancel and we obtain

\[ 3S - S = 4374 - 2 \]
or, simplified,

\[ 2S = 4372. \]

As such, we have \(S = 4372/2 = 2186\).

Derive formulas in terms of \(n\) for the following sums. You must show your work, and your final formula should only contain \(n\) and integers (but not \(k\)).

iii. \(\sum_{k=4}^{n} (3k - 1)\)

Solution: Writing out the terms of the series, we have

\[ \sum_{k=4}^{n} (3k - 1) = 11 + 14 + 17 + \cdots + (3n - 1). \]

This is an arithmetic series, as the difference between consecutive terms is a constant (in this case, 3). Again, arithmetic series can be solved in multiple ways; here we apply the method discussed in class and the text. Let \(S\) be the sum, and write the series forwards and backwards:

\[ S = 11 + 14 + 17 + \cdots + (3n - 1) \]
\[ S = (3n - 1) + (3n - 4) + (3n - 7) + \cdots + 11 \]

Now, if we sum the two equations, we obtain \(2S\) on the left-hand side. On the right-hand side, summing term-by-term, we obtain \(3n + 10\) for each summand.

\[ 2S = (3n + 10) + (3n + 10) + \cdots + (3n + 10). \]

Since the summation is from \(k = 4\) to \(n\), there are \(n - 4 + 1 = n - 3\) terms; thus,

\[ 2S = (n - 3)(3n + 10). \]

As such, we have

\[ S = \frac{(n - 3)(3n + 10)}{2}. \]

iv. \(\sum_{k=3}^{n} 4 \cdot 5^{k+2}\)

Solution: Writing out the terms of the series, we have

\[ \sum_{k=3}^{n} 4 \cdot 5^{k+2} = 4 \cdot 5^5 + 4 \cdot 5^6 + \cdots + 4 \cdot 5^{n+2}. \]

This is a geometric series, as the ratio between consecutive terms is a constant (in this case, 5). Again, geometric series can be solved in multiple ways; here we apply the method discussed in class and the text. Let \(S\) be the sum, and write the series for \(S\) and \(5S\):

\[ S = 4 \cdot 5^5 + 4 \cdot 5^6 + \cdots + 4 \cdot 5^{n+2} \]
\[ 5S = 4 \cdot 5^6 + 4 \cdot 5^7 + \cdots + 4 \cdot 5^{n+3} \]
Now, if we subtract the first equation from the second, the terms \(4 \cdot 5^6 + 4 \cdot 5^7 + \cdots + 4 \cdot 5^{n+2}\) all cancel and we obtain

\[ 5S - S = 4 \cdot 5^{n+3} - 4 \cdot 5^5 \]

or, simplified,

\[ 4S = 4 \cdot (5^{n+3} - 5^5). \]

As such, we have \(S = 5^{n+3} - 5^5\).

Problem 3 [18 pts (6,4,8)]: Comparisons of Functions

In class, we discussed quadratic sorting algorithms (Insertion-Sort and Selection-Sort) and an \(n \lg n\) sorting algorithm (Merge-Sort). Dozens, if not hundreds, of other sorting algorithms have been developed. Another well-known sorting algorithm is Shell-Sort, whose asymptotic running time is on the order of \(n \lg^2 n\) when implemented appropriately. In the problems that follow, you will compare these three algorithms for sorting.

Ignoring lower order terms and constant factors, let \(T_1(n)\), \(T_2(n)\), and \(T_3(n)\) be the effort required by Insertion-Sort, Shell-Sort, and Merge-Sort, respectively, to sort a list of length \(n\). We have

\[ T_1(n) = n^2, \qquad T_2(n) = n \lg^2 n, \qquad T_3(n) = n \lg n, \]

where \(\lg n\) is \(\log_2(n)\).

i. Suppose that you were given a budget of 100,000 units of effort. For each of the three algorithms, determine the largest list length such that the sorting effort required is guaranteed to be at most 100,000.

Solution: For \(T_1(n) = n^2\), we want \(n\) such that \(n^2 \le 100000\). The solution is

\[ n \le \sqrt{100000} = 316.228. \]

As such, \(n = 316\) is the largest such list length.

For \(T_2(n) = n \lg^2 n\), we want \(n\) such that \(n \lg^2 n \le 100000\). This cannot be solved analytically, but as stated in the hint, you can easily use binary search. We essentially want to solve the equation \(n \lg^2 n = 100000\). To apply binary search, we start with two values of \(n\), one too small and one too large, and then we examine the midpoint between these values. If the midpoint is too small, we repeat on the right half; if too large, we repeat on the left half. We can trivially start with \(n = 1\) as a value that is too small since \(1 \cdot \lg^2 1 < 100000\) and \(n = 100000\) as a
value that is too large since \(100000 \cdot \lg^2 100000 > 100000\). Applying binary search, we obtain the following tests and results:

\[
\begin{array}{rr}
n & n \lg^2 n \\ \hline
1 & 0.00 \\
100000 & 27588015.67 \\
50000 & 12183043.79 \\
25000 & 5336039.87 \\
12500 & 2315278.92 \\
6250 & 993768.96 \\
3125 & 421199.22 \\
1563 & 175953.58 \\
782 & 72234.75 \\
1172 & 121809.55 \\
977 & 96379.97 \\
1074 & 108882.44 \\
1025 & 102528.87 \\
1001 & 99444.94 \\
1013 & 100984.56 \\
1007 & 100214.17 \\
1004 & 99829.41 \\
1005 & 99957.63 \\
1006 & 100085.88
\end{array}
\]

Thus we see the solution is 1005.

For \(T_3(n) = n \lg n\), we want \(n\) such that \(n \lg n \le 100000\). This cannot be solved analytically, but as before we can use binary search. We essentially want to solve the equation \(n \lg n = 100000\). To apply binary search, we start with two values of \(n\), one too small and one too large, and then we examine the midpoint between these values. If the midpoint is too small, we repeat on the right half; if too large, we repeat on the left half. We can trivially start with \(n = 1\) as a value that is too small since \(1 \cdot \lg 1 < 100000\) and \(n = 100000\) as a value that is too large since \(100000 \cdot \lg 100000 > 100000\). Applying binary search, we
obtain the following tests and results:

\[
\begin{array}{rr}
n & n \lg n \\ \hline
1 & 0.00 \\
100000 & 1660964.05 \\
50000 & 780482.02 \\
25000 & 365241.01 \\
12500 & 170120.51 \\
6250 & 78810.25 \\
9375 & 123699.40 \\
7812 & 101020.69 \\
7031 & 89852.76 \\
7421 & 95414.75 \\
7616 & 98206.93 \\
7714 & 99612.91 \\
7763 & 100316.58 \\
7738 & 99957.51 \\
7750 & 100129.85 \\
7744 & 100043.68 \\
7741 & 100000.59 \\
7739 & 99971.87 \\
7740 & 99986.23
\end{array}
\]

Thus we see the solution is 7740.

ii. How many times larger is the list that Merge-Sort can handle, as compared to the lists that Insertion-Sort and Shell-Sort can handle? How many times larger is the list that Shell-Sort can handle, as compared to the list that Insertion-Sort can handle?

Solution: The maximum list sizes for Merge-Sort, Shell-Sort, and Insertion-Sort are 7740, 1005, and 316, respectively. Thus, (1) Merge-Sort can handle a list \(7740/1005 = 7.7\) times larger than Shell-Sort and \(7740/316 = 24.5\) times larger than Insertion-Sort; (2) Shell-Sort can handle a list \(1005/316 = 3.18\) times larger than Insertion-Sort.

iii. Suppose you are running the three algorithms on three different computers. The computer running Insertion-Sort is 5 times faster than the one running Shell-Sort, and the computer running Shell-Sort is 20 times faster than the one running Merge-Sort. How large must the list be before the computer running Merge-Sort begins to outperform the one running Shell-Sort? How large must the list be before the computer running Shell-Sort begins to outperform the one running Insertion-Sort?

Solution: Merge-Sort vs. Shell-Sort: To sort lists of size \(n\), Merge-Sort takes \(n \lg n\) units of effort and Shell-Sort takes \(n \lg^2 n\) units of effort. However, since the computer running Merge-Sort is 20 times slower than the computer running Shell-Sort, we must find an \(n\) such that

\[ 20 n \lg n < n \lg^2 n. \]
Dividing both sides of this inequality by \(n \lg n\), we obtain

\[ 20 < \lg n \]

or

\[ n > 2^{20} = 1{,}048{,}576. \]

Thus, for \(n = 1{,}048{,}577\) (or larger), Merge-Sort will outperform Shell-Sort, even when run on a computer 20 times slower.

Shell-Sort vs. Insertion-Sort: To sort lists of size \(n\), Shell-Sort takes \(n \lg^2 n\) units of effort and Insertion-Sort takes \(n^2\) units of effort. However, since the computer running Shell-Sort is 5 times slower than the computer running Insertion-Sort, we must find an \(n\) such that

\[ 5 n \lg^2 n < n^2. \]

This cannot be solved analytically, but we can use binary search as before. We essentially want to solve the equation \(5 n \lg^2 n = n^2\). To apply binary search, we start with two values of \(n\), one too small and one too large, and then we examine the midpoint between these values. If the midpoint is too small, we repeat on the right half; if too large, we repeat on the left half. We can start with \(n = 5\) as a value that is too small since \(5 \cdot 5 \lg^2 5 > 5^2\) and \(n = 1000\) as a value that is too large since \(5 \cdot 1000 \lg^2 1000 = 496{,}584.28\), which is less than \(1000^2 = 1{,}000{,}000\). Applying binary search, we obtain the following tests and results:

\[
\begin{array}{rrr}
n & 5 n \lg^2 n & n^2 \\ \hline
5 & 134.78 & 25.00 \\
1000 & 496584.28 & 1000000.00 \\
502 & 202026.37 & 252004.00 \\
253 & 80616.16 & 64009.00 \\
377 & 138069.78 & 142129.00 \\
315 & 108481.04 & 99225.00 \\
346 & 123077.31 & 119716.00 \\
361 & 130284.09 & 130321.00 \\
353 & 126429.15 & 124609.00 \\
357 & 128353.41 & 127449.00 \\
359 & 129317.95 & 128881.00 \\
360 & 129800.82 & 129600.00
\end{array}
\]

Thus we see the solution is 361, since \(5 n \lg^2 n > n^2\) at \(n = 360\) but \(5 n \lg^2 n < n^2\) at \(n = 361\).

Hint: For each question in part iii, you can write an equation that indicates that the running times of the two algorithms being compared are equal, taking into account the speeds of the computers on which the algorithms are being run. For part i, you can also write an equation to be solved. These equations may involve \(n \lg n\) and \(n \lg^2 n\), and in general, they cannot always be solved analytically. You can solve these equations, however, by using binary search.
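The binary search described in this problem can be sketched in Python as follows. This is a minimal sketch, not the method from the text: the helper name `first_true` and the variable names are hypothetical, it assumes each condition switches from false to true exactly once between the two starting values, and the midpoints it visits may differ from those in the tables.

```python
import math

def first_true(pred, lo, hi):
    """Smallest integer n in (lo, hi] with pred(n) true.

    Assumes pred(lo) is false, pred(hi) is true, and pred switches
    from false to true exactly once in between.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid   # mid already satisfies pred: keep the left half
        else:
            lo = mid   # mid does not satisfy pred: keep the right half
    return hi

BUDGET = 100_000
efforts = {
    "insertion": lambda n: n * n,
    "shell": lambda n: n * math.log2(n) ** 2,
    "merge": lambda n: n * math.log2(n),
}

# Part i: the largest n within budget is one less than the first n over budget.
largest = {name: first_true(lambda n, f=f: f(n) > BUDGET, 1, 100_000) - 1
           for name, f in efforts.items()}

# Part iii: first n at which the slower machine running the better algorithm wins.
merge_beats_shell = first_true(
    lambda n: 20 * n * math.log2(n) < n * math.log2(n) ** 2, 2, 2 ** 21)
shell_beats_insertion = first_true(
    lambda n: 5 * n * math.log2(n) ** 2 < n * n, 5, 1000)
```

Searching for the first value that exceeds the budget and then subtracting one lets a single helper serve both the "largest n within budget" questions and the "first n where one side wins" questions.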
See Exercise 11.3 in the text for
more hints on solving such equations. Your answers need only be accurate to the units place, e.g., 123 as opposed to 123.4567.

Problem 4 [12 pts (3,3,3,3)]: Search Algorithms

In binary search, we split the list in half, perform one comparison to determine if our target element is in the first or second half, and repeat on the appropriate half as necessary until only one element remains. Since there are at most \(\log_2 n\) halving operations, we use at most \(1 \cdot \log_2 n = \log_2 n\) comparisons in the worst case.

Now consider ternary search instead. Here we would split the list into thirds, perform (at most) 2 comparisons to determine which third contains our target element, and repeat on the appropriate third as necessary until only one element remains.

i. What is the worst-case number of comparisons performed by ternary search? Explain.

Solution: If we split a list into thirds, the pieces are of size \(n/3\). If we split into thirds again, the pieces are of size \((n/3)/3 = n/3^2\). In general, if we split into thirds \(k\) times, the pieces would be of size \(n/3^k\). We can do this until the pieces are of size 1; therefore, the maximum number of times we can split a list into thirds corresponds to the value of \(k\) where \(n/3^k = 1\). Solving for \(k\), we obtain \(k = \log_3 n\). Thus, we can split a list into thirds at most \(\log_3 n\) times. Each time we split the list, we perform (at most) 2 comparisons. Thus, the worst-case number of comparisons for ternary search is \(2 \log_3 n\).

ii. Which algorithm performs fewer comparisons in the worst case, and by how much? Your answer should be in the form, "Algorithm A performs x-times fewer comparisons than Algorithm B," for appropriate values of A, B, and x.

Hint: The following mathematical fact may come in handy; it allows one to change the base of a logarithm.

\[ \log_a n = \log_b n / \log_b a. \]

Solution: Ternary search requires \(2 \log_3 n\) comparisons (in the worst case); binary search requires \(\log_2 n\) comparisons (in the worst case).
Let's put everything in terms of \(\log_2\). Using the fact that \(\log_a n = \log_b n / \log_b a\), we have that \(\log_3 n = \log_2 n / \log_2 3\). Thus, ternary search requires

\[ 2 \log_3 n = \frac{2}{\log_2 3} \log_2 n \]

comparisons in the worst case. Since \(2 / \log_2 3 \approx 1.262\), ternary search performs 1.262 times more comparisons than binary search (in the worst case) or, equivalently, binary search performs 1.262 times fewer comparisons than ternary search.

Let us now generalize to \(k\)-ary search, where we split the list into \(k\) equal-size groups and perform (at most) \(k - 1\) comparisons to determine the appropriate group on which to repeat.

iii. What is the worst-case number of comparisons performed by \(k\)-ary search? Explain.

Solution: Generalizing the above argument, we can perform a \(k\)-ary split at most \(\log_k n\) times, and each split entails (at most) \(k - 1\) comparisons for a total of (at most)

\[ (k - 1) \log_k n \]

comparisons.
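The count from part iii can be evaluated numerically once it is rewritten in base 2 with the change-of-base fact from part ii. The following is a small Python sketch; the function name `comparison_factor` is hypothetical, and the factor it returns is the multiplier of \(\log_2 n\).

```python
import math

def comparison_factor(k):
    """Multiplier f(k) = (k - 1) / log2(k), so that k-ary search
    uses at most f(k) * log2(n) comparisons in the worst case."""
    return (k - 1) / math.log2(k)

# Multipliers for binary (k = 2) through 5-ary search, rounded to 3 decimals.
factors = {k: round(comparison_factor(k), 3) for k in range(2, 6)}
```

For k = 3 this gives the same multiplier, about 1.262, that appears in the solution to part ii.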
iv. What is the integer value of \(k\) which minimizes the number of comparisons in the worst case? Explain.

Hint: Ensure that your expression from part iii is in the form \(f(k) \log_2 n\) for some function \(f(k)\). To do this, make use of the fact that \(\log_a n = \log_b n / \log_b a\), as you did above.

Solution: Let's again put everything in terms of \(\log_2\). Since \(\log_k n = \log_2 n / \log_2 k\), we have

\[ (k - 1) \log_k n = \frac{k - 1}{\log_2 k} \log_2 n. \]

As \(k\) grows, \(k - 1\) grows much faster than \(\log_2 k\), so the factor \((k - 1)/\log_2 k\) gets bigger as \(k\) grows. Here are the factors for the first few values of \(k\):

\[ (2 - 1)/\log_2 2 = 1 \]
\[ (3 - 1)/\log_2 3 = 1.262 \]
\[ (4 - 1)/\log_2 4 = 1.5 \]
\[ (5 - 1)/\log_2 5 = 1.72 \]

The best value of \(k\) is the smallest, \(k = 2\), which corresponds to binary search.

Problem 5 [42 pts (8,6,8,6,6,8)]: Induction and Recurrences

1. Solve the following recurrence (assuming \(T(1) = 1\)).

\[ T(n) = 8\,T(n/2) + n^3 \]

Solution: We begin by iterating the recurrence until we notice a pattern.

\[
\begin{aligned}
T(n) &= n^3 + 8\,T(n/2) \\
&= n^3 + 8 \left( (n/2)^3 + 8\,T((n/2)/2) \right) \\
&= 2n^3 + 8^2\,T(n/2^2) \\
&= 2n^3 + 8^2 \left( (n/2^2)^3 + 8\,T((n/2^2)/2) \right) \\
&= 3n^3 + 8^3\,T(n/2^3)
\end{aligned}
\]

We now begin to see a pattern, and our conjecture is that \(\forall k \ge 1\), the iterative pattern

\[ T(n) = k \cdot n^3 + 8^k\,T(n/2^k) \]

holds. We will prove this conjecture true via induction later. Our recurrence will reach a base case when \(n/2^k = 1\) or \(k = \log_2 n\). At that point, we obtain

\[
\begin{aligned}
T(n) &= (\log_2 n) \cdot n^3 + 8^{\log_2 n}\,T(n/2^{\log_2 n}) \\
&= n^3 \log_2 n + n^{\log_2 8}\,T(1) \\
&= n^3 \log_2 n + n^3
\end{aligned}
\]

Our derivation is complete once we prove our conjecture that \(\forall k \ge 1\), the iterative pattern

\[ T(n) = k \cdot n^3 + 8^k\,T(n/2^k) \]

holds. We do so by induction.
Base case: At \(k = 1\) we have

\[ T(n) = 1 \cdot n^3 + 8^1\,T(n/2^1) = n^3 + 8\,T(n/2) \]

which is our original recurrence. Thus, our base case holds.

Inductive step: Show that if our pattern holds for \(k - 1\), i.e.,

\[ T(n) = (k - 1) \cdot n^3 + 8^{k-1}\,T(n/2^{k-1}) \]

then our pattern must hold for \(k\), i.e.,

\[ T(n) = k \cdot n^3 + 8^k\,T(n/2^k). \]

Starting with our assumption and iterating the recurrence, we obtain

\[
\begin{aligned}
T(n) &= (k - 1) \cdot n^3 + 8^{k-1}\,T(n/2^{k-1}) \\
&= (k - 1) \cdot n^3 + 8^{k-1} \left( (n/2^{k-1})^3 + 8\,T((n/2^{k-1})/2) \right) \\
&= k \cdot n^3 + 8^k\,T(n/2^k).
\end{aligned}
\]

Thus our proof is complete.

2. The Fibonacci numbers \(1, 1, 2, 3, 5, 8, \ldots\) form a sequence defined by the equation

\[ F_n = F_{n-1} + F_{n-2} \]

where \(F_1 = 1\) and \(F_2 = 1\). Consider a currency system with Fibonacci denominations. In other words, unlike the US currency system which has $1, $5, $10, $20, $50, ... denominations, we would have $1, $2, $3, $5, $8, $13, ... denominations. In what follows, we will show that the Fibonacci currency system is efficient in that very few bills are needed to make change for any value $d.

Consider the standard greedy algorithm for making change: To make change for $d, you would choose the largest denomination less than or equal to d, subtract that denomination value from d, and then repeat for the remainder. For example, to make change for $19 in the US currency system, we would choose a $10 bill, yielding a remainder of $9. We would then choose a $5 bill, yielding a remainder of $4. We would then choose a $1 bill, yielding a remainder of $3, and so on. Our change would then be $10, $5, $1, $1, $1, $1. Applying the same greedy strategy in the Fibonacci system, our change would be $13, $5, $1.
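The greedy strategy just described can be sketched in Python as follows. This is an illustrative sketch only; the function names `fibonacci_denominations` and `greedy_change` are hypothetical, and it assumes d is a positive integer and that a $1 denomination is available so the loop always finishes with no remainder.

```python
def fibonacci_denominations(limit):
    """Distinct Fibonacci denominations 1, 2, 3, 5, 8, ... not exceeding limit."""
    denoms = [1, 2]
    while denoms[-1] + denoms[-2] <= limit:
        denoms.append(denoms[-1] + denoms[-2])
    return [value for value in denoms if value <= limit]

def greedy_change(d, denominations):
    """Repeatedly take the largest denomination that fits in the remaining amount."""
    bills = []
    for value in sorted(denominations, reverse=True):
        while value <= d:
            bills.append(value)
            d -= value
    return bills

us_change = greedy_change(19, [1, 5, 10, 20, 50])
fib_change = greedy_change(19, fibonacci_denominations(19))
```

Run on the $19 example, `us_change` is the six-bill list $10, $5, $1, $1, $1, $1 and `fib_change` is the three-bill list $13, $5, $1, matching the change worked out above.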
i. In making change for $d using the greedy strategy, let \(F_n\) be the largest Fibonacci denomination value less than or equal to d. (1) Argue that \(F_n \le d < F_{n+1}\). (2) Prove that after choosing an \(F_n\) bill, the remainder \(d - F_n\) must satisfy \(d - F_n < F_{n-1}\).

Solution: (1) Since \(F_n\) is the largest Fibonacci denomination less than or equal to d, we have \(F_n \le d\) by construction. Furthermore, we must have \(d < F_{n+1}\) or else \(F_n\) is not the largest Fibonacci denomination less than or equal to d. Combining these results, we have \(F_n \le d < F_{n+1}\).

(2) We note that \(F_{n+1} = F_n + F_{n-1}\) by the definition of the Fibonacci sequence. Since \(d < F_{n+1}\), we then have

\[ d < F_{n+1} = F_n + F_{n-1}. \]

Subtracting \(F_n\) from both sides of the above inequality yields

\[ d - F_n < F_{n-1}. \]

ii. Prove by induction that for any dollar value $d, you can make change for $d without using any denomination more than once and without the use of adjacent denominations. For example, while you could make change for $14 using two $5 bills, one $3 bill, and one $1 bill, you would be using a denomination twice (the $5 bill) and you would be using adjacent denominations (the $5 and $3 bills). One could instead use one $13 and one $1 bill, which are non-adjacent and without repetition.

Hint: Make use of the results from part i.

Solution: Strong induction. The base case \(d = 1\) is obvious; we simply use a single $1 bill, satisfying our constraints.

Strong inductive step: Assume true for all \(k < d\); show for \(d\). Let \(F_n\) be the largest Fibonacci number less than or equal to \(d\); begin by picking a single \(F_n\) bill. If \(d = F_n\), then we are done and have satisfied our constraints. Otherwise, we must make change for the remaining \(d - F_n\); since \(d - F_n < d\), we can make change for this remaining amount using non-adjacent bills without repetition by our inductive hypothesis. Furthermore, we have that \(d - F_n < F_{n-1}\) from part i. Since no bill can exceed the amount being changed, when we make change for the remaining \(d - F_n\), we do not need an \(F_{n-1}\) (or higher) bill.
Thus, combining our single \(F_n\) bill with the change we made for \(d - F_n\) using non-adjacent bills without repetition, all strictly smaller than \(F_{n-1}\), we obtain an overall solution meeting our criteria.

iii. Prove that if change is being made with \(n\) denominations \(d_1, d_2, \ldots, d_n\) and adjacent denominations are not allowed, then at most \(n/2\) unique denominations can be used. For simplicity, you may assume that \(n\) is even; the more general result for \(n\) odd or even is that at most \(\lceil n/2 \rceil\) unique denominations can be used.

Hint: Try a proof by contradiction, group the denominations in adjacent pairs, and employ the Pigeonhole Principle.

Solution: Suppose, for the sake of contradiction, that the claim is false. Then it must be possible to make change with \(n/2 + 1\) (or more) unique denominations and not have adjacent denominations. Consider the denominations in adjacent pairs:

\[ \{d_1, d_2\}, \{d_3, d_4\}, \ldots, \{d_{n-1}, d_n\}. \]

There are \(n/2\) such pairs. By the Pigeonhole Principle, if we select \(n/2 + 1\) (or more) unique denominations, then at least one of the adjacent pairs must have been selected
twice (that is, both of its denominations are used), violating our constraint. Thus, our assumption is false and the claim must be true.

The following is not required, but here is the proof for \(n\) odd. Suppose, for the sake of contradiction, that the claim is false. Then it must be possible to make change with \(\lceil n/2 \rceil + 1\) (or more) unique denominations and not have adjacent denominations. Consider the denominations in adjacent pairs with one denomination left over:

\[ \{d_1, d_2\}, \{d_3, d_4\}, \ldots, \{d_{n-2}, d_{n-1}\}, \{d_n\} \]

There are \(\lceil n/2 \rceil\) such groups. By the Pigeonhole Principle, if we select \(\lceil n/2 \rceil + 1\) (or more) unique denominations, then at least one of the groups must have been selected twice. This cannot be the last group (which contains only one denomination), since we are selecting unique denominations. Therefore, one of the pairs must have been selected twice, violating our constraint. Thus, our assumption is false and the claim must be true.

iv. Prove by induction that \(\forall n \ge 6\), \(F_n \ge 2^{n/2}\). Note that \(2^{n/2} = (\sqrt{2})^n \approx 1.414^n\).

Solution: Strong induction. Base cases \(n = 6\) and \(n = 7\) are easy: \(F_6 = 8 = 2^3\) and \(F_7 = 13 > 2^{3.5} \approx 11.31\).

Strong inductive step: Assume true for all \(6 \le k < n\); show for \(n\). We have:

\[
\begin{aligned}
F_n &= F_{n-1} + F_{n-2} \\
&\ge (\sqrt{2})^{n-1} + (\sqrt{2})^{n-2} \\
&= (\sqrt{2})^{n-2} (\sqrt{2} + 1) \\
&> (\sqrt{2})^{n-2} \cdot 2 \\
&= (\sqrt{2})^n
\end{aligned}
\]

v. Given the results of the previous parts, prove (not necessarily by induction) that \(\forall d \ge 8\) it is always possible to make change for $d in the Fibonacci currency system using at most \(\log_2 d\) bills.

Hint: Start by considering the largest Fibonacci denomination \(F_n\) in any such solution and then use the result from part iv above to bound the size of \(n\). You will then need the results from other parts above to finish the proof.

Solution: Let \(F_n\) be the largest denomination used to make change for $d. We must have that \(F_n \le d\). Since \(d \ge 8 = F_6\), we have \(n \ge 6\), so the result of part iv applies. Combining with that result, we have that

\[ d \ge F_n \ge 2^{n/2}. \]

Solving for \(n\), we obtain \(n \le 2 \log_2 d\). Thus, we are making change with denominations

\[ F_1, F_2, F_3, F_4, \ldots, F_{2 \log_2 d} \]

which is \(2 \log_2 d\) different bills.
By the result of part ii, it is possible to use at most one of each such bill; hence, at most \(2 \log_2 d\) bills total. Furthermore, by the result of part ii, we need not use adjacent bills. By the result of part iii, without adjacent bills, we would not use more than half the denominations; thus, at most \(\log_2 d\) bills total.

Footnote 1 (to the count of \(2 \log_2 d\) different bills): Technically, we won't use both \(F_1\) and \(F_2\), since they are both 1, so the number of unique bills is actually one less.
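The claims of parts ii and v can be spot-checked numerically for small amounts. The following Python sketch is a sanity check under stated assumptions, not a proof: the function names `greedy_fibonacci_change` and `satisfies_claims` are hypothetical, the change is produced by the greedy strategy, and only the amounts from $8 up to $4999 are tested.

```python
import math

def greedy_fibonacci_change(d):
    """Greedy change for d >= 1 using denominations 1, 2, 3, 5, 8, ...

    Returns (index, value) pairs, largest bill first, where index is the
    position of the bill in the list of distinct denominations.
    """
    denoms = [1, 2]
    while denoms[-1] + denoms[-2] <= d:
        denoms.append(denoms[-1] + denoms[-2])
    picks = []
    remaining = d
    for i in reversed(range(len(denoms))):
        while denoms[i] <= remaining:
            picks.append((i, denoms[i]))
            remaining -= denoms[i]
    return picks

def satisfies_claims(d):
    """Check exact change, no repeated or adjacent bills, and at most log2(d) bills."""
    picks = greedy_fibonacci_change(d)
    indices = [i for i, _ in picks]
    exact = sum(value for _, value in picks) == d
    # A gap of at least 2 between consecutive indices rules out both
    # repetition (gap 0) and adjacent denominations (gap 1).
    non_adjacent = all(hi - lo >= 2 for hi, lo in zip(indices, indices[1:]))
    few_bills = len(picks) <= math.log2(d)
    return exact and non_adjacent and few_bills

all_hold = all(satisfies_claims(d) for d in range(8, 5000))
```

The index gap test folds the two constraints of part ii into one comparison, which keeps the check short.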