Divide and Conquer Problem Solving Method

Kevin R. Burger :: Computer Science & Engineering :: Arizona State University

1. Problem Instances

Let P be a problem that is amenable to the divide and conquer algorithm design method and let P_0, P_1, P_2, ... be distinct instances of the problem. Each instance consists of a unique set of problem variables and assignments of data values to those variables. For example, suppose P is to find the maximum integer in a list L of n integers. Suppose n = 8 and L = [20, 30, 40, 60, 70, 80, 50, 10]. Note that this instance of P, call it P_0, consists of two variables, n and L, and the data values assigned to n and L. Suppose one of the steps of the algorithm splits L into two sublists called L_L and L_R, and suppose we now have n = 4, L_L = [20, 30, 40, 60] and n = 4, L_R = [70, 80, 50, 10]. We have now created two new instances of the problem, call them P_1 and P_2, where P_1 consists of the variables (and their values) n = 4, L_L = [20, 30, 40, 60], and P_2 consists of n = 4, L_R = [70, 80, 50, 10].

2. Divide and Conquer Algorithms

Divide and conquer is a very common problem solving method that essentially consists of three steps.

1. We divide (or partition) a given instance P_i of a problem P into two or more smaller instances. The definition of "smaller instance" is somewhat problem-dependent, but in general a smaller instance involves less data. For example, in Section 1 we divided the original list L into two sublists L_L and L_R, each of which is smaller than L. At a higher level of abstraction, the original problem was to find the maximum integer in a list of eight integers, and the two new subproblems are to find the maximum integers in two sublists of length four.

2. We call the algorithm recursively on each new problem instance, and the algorithm magically returns the solutions to the subproblems. This is the conquer step.

3. We combine the solutions to all of the subproblem instances to form a solution to the original problem P.

Divide and conquer algorithms generally use recursion to solve the problem, and the general format of a recursive divide and conquer algorithm is,

    function divide-and-conquer(P_i)    -- the input is an instance of the problem P
        when the size of P_i is small enough to afford a simple or trivial solution then
            return the trivial solution
        create x >= 2 new, smaller problem instances P_i(1), P_i(2), P_i(3), ..., P_i(x)              -- Divide
        call divide-and-conquer on each instance, creating solutions s_i(1), s_i(2), ..., s_i(x)      -- Conquer
        combine the solutions s_i(1), s_i(2), ..., s_i(x) into the solution s_i for P_i               -- Combine
        return s_i

3. Divide and Conquer Example Problem: Find the Maximum Integer in a List of Integers

In this section, we will use divide and conquer to solve the problem P of Section 1, which is formally defined as,

    P: Let L be a list containing n >= 1 integers, with no limits on the integers. Find the maximum integer in L.

When designing algorithms, I find it to be very helpful to work out the solution to one or more concrete problems first. Once I understand how to solve the problem for specific problem instances, I then use that knowledge to design an algorithm that works for the general case. Let our problem be P: {n = 8, L = [20, 30, 40, 60, 70, 80, 50, 10]}.

Divide Step. Divide L into two sublists, L_L = [20, 30, 40, 60] and L_R = [70, 80, 50, 10], where L_L is the left half of L and L_R is the right half of L. Create two new problem instances, P_1: {n = 4, L_L} and P_2: {n = 4, L_R}.

Conquer Step. Call the algorithm recursively twice, first on P_1 and then on P_2.
The algorithm will magically return the solutions to each of P_1 and P_2, which are s_1 = 60 and s_2 = 80. We will discuss soon how this magic happens.

Combine Step. Since s_1 is the maximum integer in the left half of the original list L and s_2 is the maximum integer in the right half of L, the solution to P is the maximum of s_1 and s_2, which is s_P = 80.
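To make the general format above concrete, here is a minimal Python sketch of the same template. This is an added illustration, not part of the original pseudocode: the parameterization, passing the four strategy functions as arguments, is simply one illustrative way to express the divide, conquer, and combine steps.

    def divide_and_conquer(inst, is_trivial, solve_trivial, divide, combine):
        # Base case: the instance is small enough for a trivial solution.
        if is_trivial(inst):
            return solve_trivial(inst)
        # Divide: split the instance into two or more smaller instances.
        subinstances = divide(inst)
        # Conquer: solve each smaller instance recursively.
        subsolutions = [divide_and_conquer(s, is_trivial, solve_trivial, divide, combine)
                        for s in subinstances]
        # Combine: merge the subsolutions into a solution for this instance.
        return combine(subsolutions)

    # Example: the maximum-integer problem, where an instance is simply a list.
    print(divide_and_conquer(
        [20, 30, 40, 60, 70, 80, 50, 10],
        is_trivial=lambda L: len(L) <= 2,
        solve_trivial=lambda L: max(L),
        divide=lambda L: (L[:(len(L) + 1) // 2], L[(len(L) + 1) // 2:]),
        combine=max))    # prints 80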

So, how does the magic happen? Consider the recursion base case: how small would P_i have to be for there to be a trivial solution? Clearly, if L_i = [x], then the maximum integer is x. What about L_i = [x, y]? When n = 2, the maximum integer in L_i = [x, y] is max(x, y), where the max() function returns the maximum of its arguments. One could carry this one more step and consider a list of size n = 3, but, as we shall see when we analyze the time complexity of our algorithm, it does not improve the algorithmic complexity to do so. This takes care of the base case.

What about the divide and the conquer parts? In our example, when we divided L into left half and right half lists, we ended up with two sublists of size four each. This is the divide step. Let us trace our example again, this time representing the problem instances in a tree.

    P_0: {n = 8, L = [20, 30, 40, 60, 70, 80, 50, 10]}
        Divide:  P_1: {n = 4, L_L = [20, 30, 40, 60]}, P_2: {n = 4, L_R = [70, 80, 50, 10]}
        Conquer: divide-and-conquer(P_1), divide-and-conquer(P_2)
        Combine: s_1 = 60, s_2 = 80, s_P0 = max(s_1, s_2) = 80

        P_1: {n = 4, L = [20, 30, 40, 60]}
            Divide:  P_3: {n = 2, L_L = [20, 30]}, P_4: {n = 2, L_R = [40, 60]}
            Conquer: divide-and-conquer(P_3), divide-and-conquer(P_4)
            Combine: s_3 = 30, s_4 = 60, s_P1 = max(s_3, s_4) = 60

            P_3: {n = 2, L = [20, 30]}    Base Case: return max(20, 30)
            P_4: {n = 2, L = [40, 60]}    Base Case: return max(40, 60)

        P_2: {n = 4, L = [70, 80, 50, 10]}
            Divide:  P_5: {n = 2, L_L = [70, 80]}, P_6: {n = 2, L_R = [50, 10]}
            Conquer: divide-and-conquer(P_5), divide-and-conquer(P_6)
            Combine: s_5 = 80, s_6 = 50, s_P2 = max(s_5, s_6) = 80

            P_5: {n = 2, L = [70, 80]}    Base Case: return max(70, 80)
            P_6: {n = 2, L = [50, 10]}    Base Case: return max(50, 10)

After working through this concrete example, I have a better understanding of how to write the general algorithm, but it is worthwhile to consider another concrete example, this time with a list length n which is not a power of 2. In fact, if n is an odd integer, then when L is divided into two sublists, one sublist will have one fewer element than the other sublist. For example, when n is 3, the left sublist will be of length 2 if we let L[0 .. n/2] (performing integer division) be the left list and L[n/2 + 1 .. n-1] be the right list. This means the base case must handle the situation when n = 1, where it returns L_0, and when n = 2, where it returns max(L_0, L_1).
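Before tracing such an example, we can double-check the split arithmetic for odd n by computing the two sublist sizes directly. This small added sketch uses Python floor division to mirror the ceiling/floor split just described:

    # Sizes of the two halves for a few list lengths; for odd n the
    # left half receives the extra element (ceiling/floor split).
    for n in [1, 2, 3, 7, 8]:
        n_left = (n + 1) // 2    # ceil(n / 2)
        n_right = n // 2         # floor(n / 2)
        print(n, n_left, n_right)
    # n = 7 prints "7 4 3", matching the trace below.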
Suppose for this example we have P: {n = 7, L = [6, 3, 5, 9, 2, 4, 7]}.

    P_0: {n = 7, L = [6, 3, 5, 9, 2, 4, 7]}
        Divide:  P_1: {n = 4, L_L = [6, 3, 5, 9]}, P_2: {n = 3, L_R = [2, 4, 7]}
        Conquer: divide-and-conquer(P_1), divide-and-conquer(P_2)
        Combine: s_1 = 9, s_2 = 7, s_P0 = max(s_1, s_2) = 9

        P_1: {n = 4, L = [6, 3, 5, 9]}
            Divide:  P_3: {n = 2, L_L = [6, 3]}, P_4: {n = 2, L_R = [5, 9]}
            Conquer: divide-and-conquer(P_3), divide-and-conquer(P_4)
            Combine: s_3 = 6, s_4 = 9, s_P1 = max(s_3, s_4) = 9

            P_3: {n = 2, L = [6, 3]}    Base Case: return max(6, 3)
            P_4: {n = 2, L = [5, 9]}    Base Case: return max(5, 9)

        P_2: {n = 3, L = [2, 4, 7]}
            Divide:  P_5: {n = 2, L_L = [2, 4]}, P_6: {n = 1, L_R = [7]}
            Conquer: divide-and-conquer(P_5), divide-and-conquer(P_6)
            Combine: s_5 = 4, s_6 = 7, s_P2 = max(s_5, s_6) = 7

            P_5: {n = 2, L = [2, 4]}    Base Case: return max(2, 4)
            P_6: {n = 1, L = [7]}       Base Case: return L_0

Now that I have studied these two examples, I have a very good understanding of how to write the algorithm for the general case, which is documented below in both outline form and semi-formal pseudocode form.

    function max_list(n, L)
        Base case: When n is 1, return L_0; when n is 2, return max(L_0, L_1).
        Divide: Create two problem instances, P_1: {ceil(n/2), L_L} and P_2: {floor(n/2), L_R},
                where L_L = L[0 .. ceil(n/2)-1] and L_R = L[ceil(n/2) .. n-1].
        Conquer: Let s_1 = max_list(ceil(n/2), L_L) and s_2 = max_list(floor(n/2), L_R).
        Combine: Return max(s_1, s_2).

And here is my solution documented in semi-formal form.

    function max_list(n, L)
        when n = 1 then return L_0                     -- Base case
        when n = 2 then return max(L_0, L_1)           -- Base case
        n_L <- ceil(n/2),  L_L <- L[0 .. ceil(n/2)-1]  -- Divide
        n_R <- floor(n/2), L_R <- L[ceil(n/2) .. n-1]
        s_1 <- max_list(n_L, L_L)                      -- Conquer
        s_2 <- max_list(n_R, L_R)
        return max(s_1, s_2)                           -- Combine

And here is the Python implementation. Note: in Python list slices, e.g., L[begin:end], the element at index end is not included in the slice, so we have to account for that.

    from math import ceil, floor

    def max_list(n, L):
        # Base case
        if n == 1:
            return L[0]
        elif n == 2:
            return max(L[0], L[1])
        # Divide
        n_left = ceil(n / 2)      # size of the left half
        L_left = L[0:n_left]
        n_right = floor(n / 2)    # size of the right half
        L_right = L[n_left:]
        # Conquer
        s1 = max_list(n_left, L_left)
        s2 = max_list(n_right, L_right)
        # Combine
        return max(s1, s2)

or, if one wants to be cryptic,

    def max_list(n, L):
        if n == 1: return L[0]
        elif n == 2: return max(L[0], L[1])
        else: return max(max_list(ceil(n / 2), L[:ceil(n / 2)]),
                         max_list(floor(n / 2), L[ceil(n / 2):]))
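As a quick sanity check (an added usage example, not part of the original handout), we can run max_list() on the two instances traced earlier:

    print(max_list(8, [20, 30, 40, 60, 70, 80, 50, 10]))    # prints 80
    print(max_list(7, [6, 3, 5, 9, 2, 4, 7]))               # prints 9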

4. Time Complexity Analysis of max_list()

How efficient is max_list()? That question usually implies that we are interested in knowing how much time the algorithm takes to find a solution, but we could also discuss how efficient an algorithm is in terms of the amount of memory that the involved data structures consume, which is referred to as space efficiency.

There are two different ways of measuring the time for an algorithm. One method is to use a software timer (or a hardware timer for more accuracy, if your system provides that capability) that measures the time from the beginning of the algorithm to when the solution is obtained, in seconds or some other time units. The problem with that approach is that it only tells us how much time the algorithm takes to complete when implemented in some programming language L_1 on a specific computer system S_1 with specific hardware characteristics. With those numbers in hand, it is difficult, if not impossible, to determine how much time the algorithm will take on a different system S_2 with different hardware characteristics when implemented in some other programming language L_2.

The second method is to analyze the algorithm to determine its time (or space) complexity, with complexity usually being expressed in big O notation. This method is completely independent of programming language and hardware characteristics, so if we know that an algorithm has time complexity O(n^2), then the number of key operations it performs grows on the order of n^2 on the slowest computer on Earth and also on the fastest computer in the Universe. We can employ tricks when writing the code that may slightly improve the time efficiency, but no amount of programming language trickery will change the time complexity; the only way to do that is to devise a whole new algorithm which also solves the problem but with a lower order of growth.

In order to specify the time complexity of max_list(), we start with the formal definition of big O, which is,

    A function f(n) is O(g(n)) iff |f(n)| <= C |g(n)| for all n > n_0, where n_0 and C are positive, real constants.

The parameter n specifies the size of the problem, where the definition of size is somewhat problem-dependent. For example, in the max_list problem, the size of a specific problem instance is n, the number of elements in the list. The function f(n) is a measure of how time-consuming the so-called key operation of the algorithm is when the size of the problem is n. Generally, algorithms take longer to run on larger inputs than on smaller inputs, so f(n) grows larger as n grows larger. The function g(n) is a function we choose when trying to prove that our algorithm is O(g(n)). Common g(n) functions include,

    g(n)              Order of Growth    Common Complexity Class Name
    g(n) = c          O(1)               Constant complexity (c is a constant)
    g(n) = log n      O(log n)           Logarithmic complexity
    g(n) = n          O(n)               Linear complexity
    g(n) = n log n    O(n log n)         Linearithmic complexity (linear from the n term plus logarithmic from the log n term)
    g(n) = n^2        O(n^2)             Quadratic complexity
    g(n) = n^3        O(n^3)             Cubic complexity
    g(n) = n^c        O(n^c)             Polynomial complexity (c is a constant > 1)
    g(n) = 2^n        O(2^n)             Exponential complexity (power of 2)
    g(n) = c^n        O(c^n)             Exponential complexity (worse than power of 2 when c > 2; c is a constant > 1)
    g(n) = n!         O(n!)              Factorial complexity

These are ranked in order of growth from the best, O(1) or constant complexity, to the worst, O(n!) or factorial complexity. In general, algorithms with linearithmic complexity or better are amenable to programming solutions. Some quadratic and cubic time algorithms are feasible, but only when n is relatively small. For example, the well-known bubble sort algorithm has time complexity O(n^2); for small lists bubble sort runs about as fast as any other sorting algorithm, and it is only when n grows large that the physical time it takes to sort a list becomes a barrier (bubble sort on a list of a million elements may take hours to complete). Algorithms with exponential complexity or worse are almost never feasible unless n is very small, although there are often algorithmic design methods which can find approximate solutions (rather than exact solutions) in a reasonable amount of time. These types of algorithms are called approximation algorithms.
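To make the ranking concrete, this short added script tabulates several of the common g(n) functions for a few values of n; the gap between the rows widens dramatically as n grows.

    import math

    # Evaluate each common g(n) at a few problem sizes.
    growth = [
        ("1",       lambda n: 1),
        ("log n",   lambda n: math.log2(n)),
        ("n",       lambda n: n),
        ("n log n", lambda n: n * math.log2(n)),
        ("n^2",     lambda n: n ** 2),
        ("n^3",     lambda n: n ** 3),
        ("2^n",     lambda n: 2 ** n),
    ]
    for name, g in growth:
        print(name, [round(g(n)) for n in (8, 16, 32, 64)])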
Note that when choosing g(n), we are attempting to find an upper bound on the time (or space) complexity of an algorithm. The definition of big O essentially states that for all values of n beyond a point n_0, f(n) is less than some constant multiplied by g(n). So, if it turns out that our function f(n) is 2n + 5, then we can choose g(n) = n and prove that f(n) is O(n). However, we could also choose g(n) = n^3, and this g(n) would also be an upper bound on f(n), because f(n) = 2n + 5 can be shown to always be less than C n^3 for all n > n_0. For this reason, we choose g(n) to be the tightest upper bound possible, i.e., we always choose the smallest g(n) we can.

Enough theory for now. How do we find f(n)? The first step is to identify what is called the key operation of the algorithm. The key operation is the one, out of all of the operations, that will be performed the most times, and it is often an operation that is at the very core or essence of the algorithm. For example, bubble sort works by repeatedly comparing adjacent list elements and swapping them when they are out of order. For bubble sort, we would define the

key operation to be the comparison of list_i > list_i+1, to see whether the elements are out of order and need to be swapped. After identifying the key operation, we derive a function f(n) which counts how many times the key operation is performed on a problem of size n.

Let us consider max_list. The key operation cannot be in the base case, because the base case is executed only when the size of the problem is small enough to permit a trivial solution. The operation that divides L into L_L and L_R is performed once on every problem instance for which max_list() is called. In a sense, dividing the list into two halves is at the heart or core of a divide and conquer algorithm, so this operation could be our key operation. What about the conquer step, when we recursively call max_list() twice? That, too, is an operation at the core of a divide and conquer algorithm (it is the conquer part, after all), and it would also be a viable candidate for the key operation. However, every time the list is divided into two halves, both of those halves become parameters to recursive calls of max_list(), so the total number of divide operations will be the same as the total number of function call operations. We are now left with the final call to max() in the return statement, which will also be performed the same number of times as the divide and conquer operations. Of these three operations, the one that makes the most sense as the key operation is the divide operation, because it is performed at least as many times as any other operation and because the divide step is at the core of the algorithm. Consequently, we have identified the key operation to be the divide step, marked below,

    function max_list(n, L)
        when n = 1 then return L_0                     -- Base case
        when n = 2 then return max(L_0, L_1)           -- Base case
        n_L <- ceil(n/2),  L_L <- L[0 .. ceil(n/2)-1]  -- Divide (the key operation; we consider this to be one operation, not two)
        n_R <- floor(n/2), L_R <- L[ceil(n/2) .. n-1]
        s_1 <- max_list(n_L, L_L)                      -- Conquer
        s_2 <- max_list(n_R, L_R)
        return max(s_1, s_2)                           -- Combine

Once the key operation is identified, we derive the function f(n), which counts how many times the key operation is performed on a list of size n. Look back at the first example, where we started with a list of size n = 8. The function calls that took place, and the number of divide operations performed during each call, are listed below.

    max_list(8, L):   1 divide operation (splits L into two lists of size 4 each)
    max_list(4, L_L): 1 divide operation (splits L_L into two lists of size 2 each)
    max_list(4, L_R): 1 divide operation (splits L_R into two lists of size 2 each)

So we performed the key operation a total of 3 times. Now, look at the second example, where n = 7.

    max_list(7, L):   1 divide operation (splits L into two lists of sizes 4 and 3)
    max_list(4, L_L): 1 divide operation (splits L_L into two lists of size 2 each)
    max_list(3, L_R): 1 divide operation (splits L_R into two lists of sizes 2 and 1)
    max_list(1, [7]): 0 divide operations (base case is triggered)

Again, we performed the key operation 3 times. What we need to do now is generalize this result when n is the size of the list. Note that it will make the analysis a bit easier if we consider a list of size n = 2^p, p >= 1, since this will make the sizes of the left and right lists always be the same powers of 2, and the base case will always be triggered on a list of size 2. We could prove that doing so does not change the time complexity of max_list(), but for now, trust me, it does not. Consider a list with n = 2^6 = 64 elements, summarizing the recursive calls level by level,

    max_list() on 1 list of size 2^6 = 64:   1 divide, splitting it into two lists of size 2^5        ->  1 divide
    max_list() on 2 lists of size 2^5 = 32:  1 divide each, splitting each into two lists of size 2^4 ->  2 divides
    max_list() on 4 lists of size 2^4 = 16:  1 divide each, splitting each into two lists of size 2^3 ->  4 divides
    max_list() on 8 lists of size 2^3 = 8:   1 divide each, splitting each into two lists of size 2^2 ->  8 divides
    max_list() on 16 lists of size 2^2 = 4:  1 divide each, splitting each into two lists of size 2   -> 16 divides
    max_list() on 32 lists of size 2:        base case is triggered                                   ->  0 divides

We have 31 divide operations, so we know f(64) = 31, but surely there must be an easier way to derive f(n). Consider again a list of size n = 2^p, p >= 1. When p = 1, n = 2^1 = 2, so the base case is triggered and there are no divide operations. What about when p = 2, i.e., n = 2^2 = 4? In this case, there would be one divide operation, which divides L into two lists of size 4/2 = 2, and we know that calling max_list() on each of those sublists will not perform any divide operations, so we know that when p = 2, the number of divide operations is 1. Let us make a table of the number of divide operations for the first few values of p (and n).

    p    n = 2^p    Number of Divide Operations f(n)
    1        2       0
    2        4       1
    3        8       3   (1 divide splits L into two lists of size 4; each of the 2 recursive calls performs 1 divide)
    4       16       7   (1 divide splits L into two lists of size 8; each recursive call performs 3 divides)
    5       32      15   (1 divide splits L into two lists of size 16; each recursive call performs 7 divides)
    6       64      31   (1 divide splits L into two lists of size 32; each recursive call performs 15 divides)

So a pattern begins to emerge: it appears that f(n) = n/2 - 1 divide operations are performed on a list of size n, when n = 2^p, p >= 1. The short script below confirms this empirically for the same list sizes.
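Here is that check: a lightly instrumented version of max_list() (an added sketch, not part of the original handout) that tallies divide operations and compares the tally with n/2 - 1.

    def max_list_counting(n, L, counter):
        # max_list() with a divide-operation counter; counter is a
        # one-element list used as a mutable tally.
        if n == 1:
            return L[0]
        elif n == 2:
            return max(L[0], L[1])
        counter[0] += 1              # one divide operation
        n_left = (n + 1) // 2        # ceil(n / 2)
        s1 = max_list_counting(n_left, L[:n_left], counter)
        s2 = max_list_counting(n - n_left, L[n_left:], counter)
        return max(s1, s2)

    for p in range(1, 7):
        n = 2 ** p
        tally = [0]
        max_list_counting(n, list(range(n)), tally)
        print(n, tally[0], n // 2 - 1)    # counted vs. predicted f(n) = n/2 - 1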

However, this is not a mathematical proof of the correctness of f(n), and no mathematician worth hiser¹ salt would stop here. To prove that f(n) = n/2 - 1 for n = 2^p, p >= 1, we can use an inductive proof, i.e., one that uses induction. The word induction in this context means that we are reasoning from specific cases to a general case; e.g., if the input domain of a function g(x) is 0 <= x <= 4, and g(0) = 1, g(1) = 1, g(2) = 1, g(3) = 1, and g(4) = 1, then we can generalize this and say that g(x) = 1 for all x (in the input domain). By the way, deduction means that we are reasoning from the general case to a specific case; e.g., if g(x) = 1 for all x, then we know that g(0) = 1, g(1) = 1, g(2) = 1, g(3) = 1, and g(4) = 1.

¹ After centuries of evolving, English has yet to introduce a gender-neutral third person possessive pronoun, and the lack of it forces politically-correct writers to resort to the ugly and complicated "his/her" pronoun pair. I find this amazing. Consequently, in this document, "hiser" means "his/her".

Mathematical induction is performed in two steps. The first step, the basis step or basis case, is to prove that f(n) = n/2 - 1 for the first legal value of n, which is n = 2^1 = 2. Note that we do not consider p = 0, n = 2^0 = 1, to be a legal value, because one does not need to invoke max_list() to find the maximum integer in a list of size 1; also, f(1) evaluates to a negative number, which breaks our formula.

Basis Case: It is obvious that f(2^1) = 0, because the base case of max_list() will be triggered and we will never reach the statements that perform the divide operation; and indeed 2/2 - 1 = 0.

The second step is called the inductive step or inductive case. In this step we assume that f(k) = k/2 - 1 holds for k = 2^q, q >= 1, and we must show that f(2k) = 2k/2 - 1 = k - 1. This assumption forms what is called the induction hypothesis.

Inductive Case: Assume f(k) = k/2 - 1 holds for k = 2^q, q >= 1, so the algorithm performs k/2 - 1 divide operations on a list of size k. Now consider a list with 2k elements: the algorithm will divide L into two equally-sized sublists, which we call L_L (the left half of L) and L_R (the right half of L); the sizes of L_L and L_R will be 2k/2 = k. We then call max_list(k, L_L) and max_list(k, L_R). Since we have assumed that f(k) = k/2 - 1 is true, max_list(k, L_L) will perform k/2 - 1 divides, and so will max_list(k, L_R). Consequently, the number of divide operations on the list with 2k elements will be (k/2 - 1) + (k/2 - 1) + 1 = k - 1 = 2k/2 - 1, which is what we needed to show.

To understand the correctness of the proof, we combine the basis and inductive steps. The basis step proved that for the first value of n = 2^1 = 2, f(2) = 0. The inductive step proved that if f(k = 2^q) = k/2 - 1 is true, then f(2k = 2^(q+1)) = 2k/2 - 1. Together, these two facts imply this sequence for f: f(2^1) = 0, f(2^2) = 1, f(2^3) = 3, f(2^4) = 7, f(2^5) = 15, and so on.

An alternative method for deriving f(n), and a more common one than induction, is to create and solve a recurrence relation (see Appendix A), where f(n) is the number of divide operations that are performed on a list of size n = 2^p, p >= 1. A recurrence relation describes a function in terms of itself, i.e., it is rather like a recursive function. When we have 2 or fewer elements in the list, f(n) is 0. Otherwise, for a list of size n = 2^p, p >= 1, the number of divide operations is one (for splitting L itself) plus twice the number of divide operations performed on a sublist of size n/2 (there are two such sublists). Here is our recurrence relation,

    f(n) = 0                if n <= 2    (Base Case)
    f(n) = 2 f(n/2) + 1     if n > 2     (Recurrence Case)

Some recurrence relations are easy to solve and others are a bit more work. One popular method is to use back substitution: write out enough recurrence terms until a pattern emerges, which then permits us to generalize the solution.

    f(n) = 2 f(n/2) + 1, for n = 2^p, p >= 1                                          Eqn. 1

But f(n/2) = 2 f(n/2^2) + 1, so substituting into Eqn. 1 we have,

    f(n) = 2 (2 f(n/2^2) + 1) + 1 = 2^2 f(n/2^2) + 2 + 1                              Eqn. 2

But f(n/2^2) = 2 f(n/2^3) + 1, so substituting into Eqn. 2 we have,

    f(n) = 2^2 (2 f(n/2^3) + 1) + 2 + 1 = 2^3 f(n/2^3) + 2^2 + 2 + 1                  Eqn. 3

But f(n/2^3) = 2 f(n/2^4) + 1, so substituting into Eqn. 3 we have,

    f(n) = 2^3 (2 f(n/2^4) + 1) + 2^2 + 2 + 1 = 2^4 f(n/2^4) + 2^3 + 2^2 + 2 + 1      Eqn. 4

Now a pattern is beginning to emerge,

    f(n) = 2^q f(n/2^q) + 2^(q-1) + 2^(q-2) + ... + 2 + 1                             Eqn. 5

The recurrence will continue in this manner until the argument of f, namely n/2^q, becomes 2, triggering the base case; this happens when q = p - 1, per the derivation below.

    n/2^q = 2       -- when this is true, the base case is triggered and we stop recursively calling max_list()
    n/2 = 2^q
    q = lg(n/2)

and since n = 2^p, p >= 1, we have q = lg(n/2) = lg(2^p / 2) = lg(2^(p-1)) = p - 1. Consequently, from Eqn. 5,

    f(n = 2^p) = 2^(p-1) f(2) + (2^(p-2) + 2^(p-3) + ... + 2 + 1) = 0 + 2^(p-1) - 1 = 2^(p-1) - 1.

Since n = 2^p, then p = lg n, so f(n) = 2^((lg n) - 1) - 1 = 2^(lg n) / 2 - 1 = n/2 - 1.

Whew! All of that just to derive f(n). Deriving f(n) is, by far, the most difficult and time-consuming step in proving the time or space complexity of an algorithm. However, we are still not done analyzing the time complexity, as we must now prove that f(n) is O(g(n)) for some g(n). To do that, we can start with the formal definition of big O, dropping the absolute value symbols surrounding f(n) and g(n), since f(n) will never be negative, i.e., we cannot perform a negative number of divide operations. We must prove that,

    f(n) = n/2 - 1 <= C g(n), for all n > n_0, where C is a real, positive constant.

If we let g(n) = n, then we must prove that n/2 - 1 <= C n for all n > n_0. Let n_0 be 1. Then,

    n/2 - 1 <= C n
    C >= (n/2 - 1) / n
    C >= 1/2 - 1/n

When n > n_0 = 1, the 1/n term is positive and converges toward 0, so the right-hand side is always less than 1/2; consequently, any C >= 1/2 works, and we can safely let C = 2. We have now chosen g(n) = n, n_0 = 1, and C = 2, and this proves that f(n) is O(n), i.e., max_list() has linear time complexity.

For comparison, consider the linear search pseudocode below, whose key operation is the comparison L_i > max. That comparison is performed exactly f(n) = n - 1 times regardless of the arrangement of the list elements, so linear search also has O(n) time complexity (g(n) = n, C = 1, and n_0 = 1).²

    function max_list(n, L)    -- Linear Search Method
        max <- L_0
        for i <- 1 to n - 1 do
            when L_i > max then max <- L_i
        return max

So it seems that we have not gained anything by solving this problem using divide and conquer, because both algorithms have linear time complexity. Comparing f_LS(n) = n - 1 and f_DC(n) = n/2 - 1, it appears that in practice, the divide and conquer algorithm should be almost twice as fast. However, this comparison compares apples and oranges: for linear search, the key operation is a simple relational comparison, whereas for divide and conquer, the key operation is a more costly divide-the-list-into-two-halves operation, which will be substantially slower than the relational comparison, relatively speaking. Furthermore, the divide and conquer algorithm employs recursion, which incurs additional cost due to the time that it takes to call a function at the CPU level and also due to the time that it takes for each recursive call of max_list() to allocate and deallocate the required stack frame. Consequently, although our divide and conquer max_list() solution performs fewer key operations (by a constant factor, but not a change in the order of growth complexity), it is likely to actually take more time to complete than the linear search version. The only way to know would be to perform an experimental analysis of both solutions implemented in a programming language and compare the actual physical times (or wall times) the algorithms take on lists of varying sizes. We will leave that analysis as a homework exercise for the student; a starting-point sketch appears after the footnote below.

² Big O provides an upper bound on the complexity of an algorithm, i.e., the time or amount of space will never exceed the order of growth function, but it can be less than it. We often analyze algorithms to determine orders of growth for the best case scenario, the worst case scenario, and an average case scenario. It is possible for an algorithm to have these characteristics: T_BEST is O(1), T_WORST is O(n^2), and T_AVG is O(n lg n). If the best, worst, and average case orders of growth for an algorithm are all the same O(g(n)), then we say that the algorithm has time or space complexity Θ(g(n)), pronounced "big theta of g of n". Linear search has time complexity Θ(n).
By the way, it also has space complexity Θ(n), because the list has exactly n elements.
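As a hedged starting point for that experiment, the added sketch below times both implementations with Python's timeit module. It assumes the divide and conquer max_list() from Section 3 is already defined; max_list_linear() is a direct transcription of the linear search pseudocode above.

    import timeit

    def max_list_linear(n, L):
        # Linear search method from the pseudocode above.
        best = L[0]
        for i in range(1, n):
            if L[i] > best:
                best = L[i]
        return best

    for n in (1_000, 10_000, 100_000):
        L = list(range(n))
        t_dc = timeit.timeit(lambda: max_list(n, L), number=10)
        t_ls = timeit.timeit(lambda: max_list_linear(n, L), number=10)
        print(n, round(t_dc, 4), round(t_ls, 4))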

5. Summary

Divide and conquer is a common problem solving method that is applicable to problems having these characteristics.

1. A problem instance P_i of a certain size S_i can be divided into two or more new problem instances P_i(1), P_i(2), P_i(3), ..., all of which are smaller than the parent problem instance P_i, i.e., of size less than S_i. This is the divide step.

2. Each of the two or more new problem instances P_i(1), P_i(2), P_i(3), ... is passed to the divide and conquer function in a recursive call.

3. The recursive function calls return solutions s_i(1), s_i(2), s_i(3), ... to the problem instances P_i(1), P_i(2), P_i(3), ... This is the conquer step.

4. When the size S_i of a problem instance P_i falls below some minimum size S_min that permits a trivial solution s_i to P_i, the function simply returns the trivial solution s_i.

5. For problem instance P_i, after all of the recursive function calls for P_i(1), P_i(2), P_i(3), ... have completed, we combine the solutions s_i(1), s_i(2), s_i(3), ... to form a solution s_i to P_i, and then return s_i. This is the combine step.

The time or space complexity can be analyzed with a function f(n) which counts the number of times the key operation is performed; for a divide and conquer algorithm, f(n) is generally defined by a recurrence relation, which can often be solved systematically using well-known recurrence relation solving methods. See [1, 7-9].

6. References

1. Discrete Mathematics - Recurrence Relation, TutorialPoint.com
2. Divide and Conquer, Princeton University
3. Divide and Conquer Algorithms, Khan Academy
4. Divide and Conquer Algorithms, Wikipedia
5. Introduction to Algorithms: Divide and Conquer, Wayne Goddard, Clemson University
6. Introduction to Algorithms, Lecture 3: Divide and Conquer, MIT
7. Solving Recurrences, Jeff Erickson
8. Recurrence Relation, Wikipedia
9. Recursive Algorithms and Recurrence Relations, Ned Okie, Radford University
10. Mathematical Relations, Wikipedia

Revision History

Rev Oct 2016: Original revision.
Rev Oct 2016: Added Appendix A on recurrence relations and the back substitution method.

Appendix A

See [1, 7-9] for a discussion of recurrence relations and methods to solve them. A recurrence relation is a mathematical relation [10] which is recursively defined. For example, consider this recurrence relation t(n),

    t(n) = 3               if n < 5     (Base Case)
    t(n) = 2 t(n-1) + 1    if n >= 5    (Recurrence Case)

A recurrence relation always has a base case which gives a trivial solution when the argument n reaches some value or falls within some set of values. In this relation, when n drops below 5, t(n) is simply 3. Otherwise, when n >= 5, t(n) is recursively defined as 2t(n-1) + 1. For example, if n = 8, we can find t(8) as follows.

    t(8) = 2t(8-1) + 1 = 2t(7) + 1, since n = 8 >= 5. Consequently, to calculate t(8) we need to know t(7).
    t(7) = 2t(7-1) + 1 = 2t(6) + 1, since n = 7 >= 5. Consequently, to calculate t(7) we need to know t(6).
    t(6) = 2t(6-1) + 1 = 2t(5) + 1, since n = 6 >= 5. Consequently, to calculate t(6) we need to know t(5).
    t(5) = 2t(5-1) + 1 = 2t(4) + 1, since n = 5 >= 5. Consequently, to calculate t(5) we need to know t(4).
    t(4) = 3, since n = 4 < 5. Now substitute 3 into the equation for t(5) above.
    t(5) = 2t(4) + 1 = 2(3) + 1 = 7. Now substitute 7 into the equation for t(6) above.
    t(6) = 2t(5) + 1 = 2(7) + 1 = 15. Now substitute 15 into the equation for t(7) above.
    t(7) = 2t(6) + 1 = 2(15) + 1 = 31. Now substitute 31 into the equation for t(8) above.
    t(8) = 2t(7) + 1 = 2(31) + 1 = 63.

To solve a recurrence relation means to derive a closed-form function t(n) which, given n, yields the same value for t(n) that we would get if we plugged n into the recurrence relation and expanded it as we did above. A common method for solving recurrence relations is back substitution, which is essentially what we just did: when we finally reached t(4), the trivial value was 3; we substituted 3 back into the equation for t(5), which permitted us to determine the value of t(5); we then substituted that value back into the equation for t(6); and this continued until we back substituted the value t(7) = 31 into the equation for t(8). We now need to solve this problem for the general case t(n), and back substitution will lead us to the solution. By the way, there are other methods for solving recurrence relations that are discussed in the references, so this is just one way to solve it.

In back substitution we generate the first few equations for t(n), t(n-1), t(n-2), t(n-3), and so on, until we see an obvious pattern. Here we go,

    t(n) = 2t(n-1) + 1                                                                Eqn. A.1

But t(n-1) = 2t(n-2) + 1, so back substitute 2t(n-2) + 1 into Eqn. A.1:

    t(n) = 2[2t(n-2) + 1] + 1 = 2^2 t(n-2) + 2 + 1                                    Eqn. A.2

But t(n-2) = 2t(n-3) + 1, so back substitute 2t(n-3) + 1 into Eqn. A.2:

    t(n) = 2^2 [2t(n-3) + 1] + 2 + 1 = 2^3 t(n-3) + 2^2 + 2 + 1                       Eqn. A.3

But t(n-3) = 2t(n-4) + 1, so back substitute 2t(n-4) + 1 into Eqn. A.3:

    t(n) = 2^3 [2t(n-4) + 1] + 2^2 + 2 + 1 = 2^4 t(n-4) + 2^3 + 2^2 + 2 + 1           Eqn. A.4

But t(n-4) = 2t(n-5) + 1, so back substitute 2t(n-5) + 1 into Eqn. A.4:

    t(n) = 2^4 [2t(n-5) + 1] + 2^3 + 2^2 + 2 + 1 = 2^5 t(n-5) + 2^4 + 2^3 + 2^2 + 2 + 1    Eqn. A.5

At this point we can stop, because the pattern we were looking for has emerged,

    t(n) = 2^q t(n-q) + 2^(q-1) + 2^(q-2) + ... + 2 + 1                               Eqn. A.6

We now need to determine what q is. To do that, we consider what q must be so that t(n-q) will be t(4), thus triggering the base case, which gives us t(n-q) = 3. Well, if we want t(n-q) to be t(4), then clearly we must have n - q = 4. Solving for q, we have q = n - 4. Therefore,

    t(n) = 2^(n-4) t(n - (n-4)) + 2^(n-5) + 2^(n-6) + ... + 2 + 1
    t(n) = 3 * 2^(n-4) + Σ [i=0 to n-5] 2^i                                           Eqn. A.7

It is well known that Σ [i=0 to n-1] 2^i = 1 + 2 + 2^2 + ... + 2^(n-1) = 2^n - 1. Hence,

    t(n) = 3 * 2^(n-4) + Σ [i=0 to n-5] 2^i
    t(n) = 3 * 2^(n-4) + (2^(n-4) - 1)
    t(n) = 4 * 2^(n-4) - 1
    t(n) = 2^2 * 2^(n-4) - 1
    t(n) = 2^(n-2) - 1                                                                Eqn. A.8

We know from the example above that t(8) = 63, and substituting n = 8 into Eqn. A.8 we get t(8) = 2^6 - 1 = 63. Yay!
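As a final check (an added sketch), this short script evaluates the recurrence directly and compares it with the closed form of Eqn. A.8; the two columns agree.

    def t(n):
        # The recurrence relation from the start of Appendix A.
        return 3 if n < 5 else 2 * t(n - 1) + 1

    for n in range(5, 13):
        print(n, t(n), 2 ** (n - 2) - 1)    # recurrence vs. closed form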


More information

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018 CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis

More information

Running Time Evaluation

Running Time Evaluation Running Time Evaluation Quadratic Vs. Linear Time Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 19 1 Running time 2 Examples 3 Big-Oh, Big-Omega, and Big-Theta Tools 4 Time

More information

Solving Recurrences. 1. Express the running time (or use of some other resource) as a recurrence.

Solving Recurrences. 1. Express the running time (or use of some other resource) as a recurrence. Solving Recurrences Recurrences and Recursive Code Many (perhaps most) recursive algorithms fall into one of two categories: tail recursion and divide-andconquer recursion. We would like to develop some

More information

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018 CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis

More information

ECE250: Algorithms and Data Structures Analyzing and Designing Algorithms

ECE250: Algorithms and Data Structures Analyzing and Designing Algorithms ECE50: Algorithms and Data Structures Analyzing and Designing Algorithms Ladan Tahvildari, PEng, SMIEEE Associate Professor Software Technologies Applied Research (STAR) Group Dept. of Elect. & Comp. Eng.

More information

Asymptotic Analysis 1

Asymptotic Analysis 1 Asymptotic Analysis 1 Last week, we discussed how to present algorithms using pseudocode. For example, we looked at an algorithm for singing the annoying song 99 Bottles of Beer on the Wall for arbitrary

More information

This chapter covers asymptotic analysis of function growth and big-o notation.

This chapter covers asymptotic analysis of function growth and big-o notation. Chapter 14 Big-O This chapter covers asymptotic analysis of function growth and big-o notation. 14.1 Running times of programs An important aspect of designing a computer programs is figuring out how well

More information

Problem Set 1. MIT students: This problem set is due in lecture on Wednesday, September 19.

Problem Set 1. MIT students: This problem set is due in lecture on Wednesday, September 19. Introduction to Algorithms September 7, 2001 Massachusetts Institute of Technology 6.046J/18.410J Singapore-MIT Alliance SMA5503 Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson Handout 7

More information

Introduction to Divide and Conquer

Introduction to Divide and Conquer Introduction to Divide and Conquer Sorting with O(n log n) comparisons and integer multiplication faster than O(n 2 ) Periklis A. Papakonstantinou York University Consider a problem that admits a straightforward

More information

Data Structures and Algorithms Chapter 2

Data Structures and Algorithms Chapter 2 1 Data Structures and Algorithms Chapter 2 Werner Nutt 2 Acknowledgments The course follows the book Introduction to Algorithms, by Cormen, Leiserson, Rivest and Stein, MIT Press [CLRST]. Many examples

More information

1 Substitution method

1 Substitution method Recurrence Relations we have discussed asymptotic analysis of algorithms and various properties associated with asymptotic notation. As many algorithms are recursive in nature, it is natural to analyze

More information

Problem-Solving via Search Lecture 3

Problem-Solving via Search Lecture 3 Lecture 3 What is a search problem? How do search algorithms work and how do we evaluate their performance? 1 Agenda An example task Problem formulation Infrastructure for search algorithms Complexity

More information

Advanced Counting Techniques

Advanced Counting Techniques . All rights reserved. Authorized only for instructor use in the classroom. No reproduction or further distribution permitted without the prior written consent of McGraw-Hill Education. Advanced Counting

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms Autumn 2018-2019 Outline 1 Algorithm Analysis (contd.) Outline Algorithm Analysis (contd.) 1 Algorithm Analysis (contd.) Growth Rates of Some Commonly Occurring Functions

More information

When we use asymptotic notation within an expression, the asymptotic notation is shorthand for an unspecified function satisfying the relation:

When we use asymptotic notation within an expression, the asymptotic notation is shorthand for an unspecified function satisfying the relation: CS 124 Section #1 Big-Oh, the Master Theorem, and MergeSort 1/29/2018 1 Big-Oh Notation 1.1 Definition Big-Oh notation is a way to describe the rate of growth of functions. In CS, we use it to describe

More information

Lecture 7: More Arithmetic and Fun With Primes

Lecture 7: More Arithmetic and Fun With Primes IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Advanced Course on Computational Complexity Lecture 7: More Arithmetic and Fun With Primes David Mix Barrington and Alexis Maciel July

More information

Exercise Sheet #1 Solutions, Computer Science, A M. Hallett, K. Smith a k n b k = n b + k. a k n b k c 1 n b k. a k n b.

Exercise Sheet #1 Solutions, Computer Science, A M. Hallett, K. Smith a k n b k = n b + k. a k n b k c 1 n b k. a k n b. Exercise Sheet #1 Solutions, Computer Science, 308-251A M. Hallett, K. Smith 2002 Question 1.1: Prove (n + a) b = Θ(n b ). Binomial Expansion: Show O(n b ): (n + a) b = For any c 1 > b = O(n b ). Show

More information

COMPUTER ALGORITHMS. Athasit Surarerks.

COMPUTER ALGORITHMS. Athasit Surarerks. COMPUTER ALGORITHMS Athasit Surarerks. Introduction EUCLID s GAME Two players move in turn. On each move, a player has to write on the board a positive integer equal to the different from two numbers already

More information

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015 CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Catie Baker Spring 2015 Today Registration should be done. Homework 1 due 11:59pm next Wednesday, April 8 th. Review math

More information

Computational Complexity. This lecture. Notes. Lecture 02 - Basic Complexity Analysis. Tom Kelsey & Susmit Sarkar. Notes

Computational Complexity. This lecture. Notes. Lecture 02 - Basic Complexity Analysis. Tom Kelsey & Susmit Sarkar. Notes Computational Complexity Lecture 02 - Basic Complexity Analysis Tom Kelsey & Susmit Sarkar School of Computer Science University of St Andrews http://www.cs.st-andrews.ac.uk/~tom/ twk@st-andrews.ac.uk

More information

3. Algorithms. What matters? How fast do we solve the problem? How much computer resource do we need?

3. Algorithms. What matters? How fast do we solve the problem? How much computer resource do we need? 3. Algorithms We will study algorithms to solve many different types of problems such as finding the largest of a sequence of numbers sorting a sequence of numbers finding the shortest path between two

More information

Mergesort and Recurrences (CLRS 2.3, 4.4)

Mergesort and Recurrences (CLRS 2.3, 4.4) Mergesort and Recurrences (CLRS 2.3, 4.4) We saw a couple of O(n 2 ) algorithms for sorting. Today we ll see a different approach that runs in O(n lg n) and uses one of the most powerful techniques for

More information

Discrete Mathematics CS October 17, 2006

Discrete Mathematics CS October 17, 2006 Discrete Mathematics CS 2610 October 17, 2006 Uncountable sets Theorem: The set of real numbers is uncountable. If a subset of a set is uncountable, then the set is uncountable. The cardinality of a subset

More information

CSC236 Week 4. Larry Zhang

CSC236 Week 4. Larry Zhang CSC236 Week 4 Larry Zhang 1 Announcements PS2 due on Friday This week s tutorial: Exercises with big-oh PS1 feedback People generally did well Writing style need to be improved. This time the TAs are lenient,

More information

Algorithm efficiency analysis

Algorithm efficiency analysis Algorithm efficiency analysis Mădălina Răschip, Cristian Gaţu Faculty of Computer Science Alexandru Ioan Cuza University of Iaşi, Romania DS 2017/2018 Content Algorithm efficiency analysis Recursive function

More information