CSE 591 Homework 3 Sample Solutions

Problem 1

(a) Let $p = \max\{1, k - d\}$ and $q = \min\{n, k + d\}$. It suffices to find the $p$th and $q$th largest elements of $L$ and then output all elements in the range between these two numbers by scanning the list again. Assuming that $L = \{l_0, l_1, \ldots, l_{n-1}\}$, we first make all elements distinct by setting $l_i = l_i \cdot n + i$ for all $0 \le i \le n - 1$. (It is easy to check that no two elements are the same in the new list.) Then we apply the Quickselect algorithm to obtain the $p$th and $q$th largest elements. To guarantee $O(n)$ time, we use the median-of-medians method to determine the pivot. The pseudocode follows:

Quickselect Algorithm

function Quickselect(list, left, right, k)
    n = right - left + 1
    if n <= 5 then
        Sort(list, left, right) using insertion sort.
        Return k-th largest element in the sorted subarray
    pivot = median-of-medians(list, left, right)
    pivotIndex = partition(list, left, right, pivot)    (here we put large elements in the left part)
    if k <= pivotIndex - left + 1 then
        Return Quickselect(list, left, pivotIndex, k)
    else
        k = k - pivotIndex + left - 1
        Return Quickselect(list, pivotIndex + 1, right, k)

The Quickselect algorithm runs in $O(n)$ time. We give a simple proof sketch here. The grouping step of median-of-medians and the partition step both run in $O(n)$ time, finding the median of the group medians is a recursive selection on about $n/5$ elements, and the resulting pivot guarantees that the recursive call is made on at most about $7n/10$ elements. Hence $T(n) \le T(n/5) + T(7n/10) + O(n)$, and since $1/5 + 7/10 < 1$, induction gives $T(n) = O(n)$.
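The Quickselect pseudocode above, together with the tagging and scanning steps of part (a), can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the solution's own implementation: the function names `median_of_medians`, `quickselect`, and `elements_near_rank` are hypothetical, the inputs are assumed to be integers, the partition builds new lists instead of working in place, and the result is returned in descending order (the source only says the output is sorted).

```python
def median_of_medians(values):
    """Pick a pivot: the median of the medians of groups of five."""
    groups = [values[i:i + 5] for i in range(0, len(values), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    return quickselect(medians, (len(medians) + 1) // 2)


def quickselect(values, k):
    """Return the k-th largest element of values (k = 1 is the maximum)."""
    if len(values) <= 5:
        return sorted(values, reverse=True)[k - 1]
    pivot = median_of_medians(values)
    # Out-of-place partition: large elements go "to the left", as in the text.
    larger = [v for v in values if v > pivot]
    smaller = [v for v in values if v < pivot]
    equal = len(values) - len(larger) - len(smaller)
    if k <= len(larger):
        return quickselect(larger, k)
    if k <= len(larger) + equal:
        return pivot
    return quickselect(smaller, k - len(larger) - equal)


def elements_near_rank(values, k, d):
    """Return the elements whose rank lies in [k - d, k + d], largest first.

    Assumes integer inputs so that v * n + i can be undone by floor division.
    """
    n = len(values)
    p, q = max(1, k - d), min(n, k + d)
    # Make all elements distinct: l_i becomes l_i * n + i.
    tagged = [v * n + i for i, v in enumerate(values)]
    high = quickselect(tagged, p)  # p-th largest tagged value
    low = quickselect(tagged, q)   # q-th largest tagged value
    found = [t // n for t in tagged if low <= t <= high]
    return sorted(found, reverse=True)  # sorts at most 2d + 1 elements
```

For example, `elements_near_rank([7, 3, 9, 3, 5, 1, 8, 2, 6, 4], 5, 2)` selects the elements of rank 3 through 7; the final sort touches only those few elements, which is where the $d \log d$ term comes from.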
Say that the $p$th largest element is $P$ and the $q$th largest element is $Q$. To form the (unsorted) list $F$, scan the list and add $\lfloor l_i / n \rfloor$ (the original value of the element, for integer inputs) to $F$ if $P \ge l_i \ge Q$. Then sort $F$ in $O((q - p) \log(q - p)) = O(d \log d)$ time and output the sorted list.

(b) As mentioned in (a), the algorithm runs in $O(n + d \log d)$ time. When $d$ is $O(1)$, i.e., a constant, we can ignore the second term and the algorithm runs in $O(n)$ time. If $d$ is $\Omega(n)$, the algorithm runs in $O(n \log n)$ time.

Problem 2

(a) To compute $x^{2l}$ we can compute $(x^l)^2$, and to compute $x^{2l+1}$ we can compute $x (x^l)^2$ (a divide and conquer method). This is the idea in the following, where $k$ has $m$ bits and bit 1 is the least significant:

function Exponentiation(r, k)
    ans = 1
    for i = 1 to m do
        if k's ith bit is 1 then
            ans = ans * r
        r = r * r
    end for
    Return ans

The function computes ans correctly: (1) when $k = 0$, no bit is set and we return 1; (2) when $k > 0$, we compute the power in the way mentioned above. We need a multiplication operation; it can be done using the Karatsuba algorithm, which takes $O(n^{\log_2 3})$ bit operations to multiply two $n$-bit numbers.

(b) We now show the asymptotic complexity of this function, where $n$ is the number of bits of the input $r$. Firstly, we have a simple fact: if we multiply an $m$-bit number by an $n$-bit number, the resulting number has at most $n + m$ bits. Secondly, we observe that at the beginning of each iteration, $r \ge ans$. Hence, in order to get an $O(\cdot)$ bound, we only need to consider the total number of operations from $r \cdot r$. In the first iteration, it takes $O(n^{\log_2 3})$. In the second iteration, it takes $O((2n)^{\log_2 3})$, and so on. In the $m$-th iteration, it takes $O((2^{m-1} n)^{\log_2 3})$ bit operations. Summing these up, the complexity is

$$O\left(\sum_{i=1}^{m} (2^{i-1} n)^{\log_2 3}\right) = O\left(n^{\log_2 3} \sum_{i=1}^{m} (2^{\log_2 3})^{i-1}\right),$$

which is

$$O\left(n^{\log_2 3} \cdot \frac{2^{m \log_2 3} - 1}{2^{\log_2 3} - 1}\right).$$

Since $2^{\log_2 3} = 3$, this is $O((2^m n)^{\log_2 3})$.

Problem 3

(a) We have no prior knowledge of any candidate's name or ID. We have the list of all votes, not sorted. Compute the median (using the median-of-medians algorithm). Scan the list to determine how many times the median name appears. If it appears more than $n/2$ times, report this candidate as the winner. If not, there is no winner (because if the list were sorted, any candidate with more than $n/2$ votes would have to occupy the median position). So the time complexity is $O(n)$.

(b) We generalize the median approach above. Set our target $\tau$ to be $\lfloor n/k \rfloor + 1$. Winner($X$, $\tau$) determines all winners (candidates for the run-off) in a list $X$ of size $n$, each getting at least $\tau$ votes, as follows.

1. Determine the median value $\mu$; scan the list to partition it into three sets: $S$ contains those smaller than $\mu$, $L$ contains those larger, and $E$ contains those equal to $\mu$. If $|E| \ge \tau$, output $\mu$ as a winner.
2. If $|S| \ge \tau$, call Winner($S$, $\tau$).
3. If $|L| \ge \tau$, call Winner($L$, $\tau$).

Because this terminates on lists of size smaller than $\tau$, and $\max(|S|, |L|) \le |X|/2$, the method has a depth of recursion that is at most $\log k$. Each level of the recursion works on disjoint sublists of total size at most $n$, so it costs $O(n)$. Hence the method runs in $O(n \log k)$ time.

Problem 4

The original method merges the polling stations' lists one by one. It takes $O(n + m)$ time to merge lists of length $n$ and $m$. Hence, the first merge takes $O(p_1 + p_2)$, the second merge takes $O(p_1 + p_2 + p_3)$, ..., and the last merge takes $O(\sum_{i=1}^{p} p_i)$. The total complexity is $O(\sum_{i=2}^{p} \sum_{j=1}^{i} p_j)$. In the worst case, it can be $O(np)$ when $p_1$ is $\Theta(n)$.

Our purpose is to devise a divide and conquer algorithm. Here is our idea:

function Merge(list set)
    if there is only one list then
        Return the only list
    if there are two lists then
        Merge these two lists as a new list
        Return the new list
    Divide list set into two subsets with sizes as equal as possible
    A = Merge(first set)
    B = Merge(second set)
    Merge A and B as a new list
    Return the new list

The recursion has $O(\log p)$ levels, and the merges on each level touch every vote at most once. For $n$ voters and $p$ stations, the time complexity is therefore $O(n \log p)$.

Problem 5

(a) Let $T(V_1)$, $T(V_2)$ be the two subtrees of $T$ induced on $V_1$ and $V_2$ respectively. It suffices to show that $T(V_1)$ is connected. Suppose not; then $T(V_1)$ has at least two components. Let $e$ be the edge that we removed from $T$. Now $e$ has one endpoint in $T(V_2)$ and the other endpoint in $T(V_1)$. Let $C_1$ be the component of $T(V_1)$ that contains an endpoint of $e$, and let $C_2$ be another component of $T(V_1)$. If we put $e$ back, $C_2$ is not affected, and hence is still disconnected from $C_1$, contradicting the fact that $T$ is a spanning tree. Hence $T(V_1)$ must be connected, and the subgraph induced on $V_1$ is also connected. A similar argument can be applied to $V_2$ and $T(V_2)$.

(b) Let $\rho = \min |V_1| / |V_2|$ over all $V_1, V_2$ (taking the most balanced edge removal in each tree); we claim that $\rho = 1/3$.

First we show $\rho \le 1/3$.

[Figure: example tree]

No matter what edge we remove, the ratio is $1/3$, hence $\rho \le 1/3$. For instance, in a star with one center and three leaves, removing any edge leaves parts of sizes 1 and 3.

Next, we show that $\rho \ge 1/3$. Because we are to minimize the ratio, assume $|V_1| \le |V_2|$. Let $e = (a, b)$ be the edge whose removal yields the most balanced separation, with $b \in V_2$. Every vertex has degree at most 3 in $T$, so $b$ has degree at most 2 after removing $e$. We have 3 cases to consider.

1. If $b$ has degree 0, then $|V_1| = |V_2| = 1$ and the ratio is 1.
2. If $b$ has degree 1, then $|V_2| - 1 \le |V_1|$. Otherwise we would pick $b$'s other incident edge as $e$ and have a more balanced partition, contradicting the optimality of our choice. Hence $\frac{|V_1|}{|V_2|} \ge \frac{|V_1|}{|V_1| + 1} \ge \frac{1}{2}$.
3. If $b$ has degree 2, let $e'$ and $e''$ be its two incident edges and let $C_1$, $C_2$ be the sets of vertices separated from $b$ by removing $e'$ and $e''$, respectively, in $T(V_2)$. We claim $|C_1| \le |V_1|$ and $|C_2| \le |V_1|$. Suppose not. Then w.l.o.g. $|C_1| > |V_1|$, and by removing $e'$ in $T$ we have a better partition, also contradicting our choice of $e$. Therefore, since $|V_2| = |C_1| + |C_2| + 1$, we have $\frac{|V_1|}{|V_2|} \ge \frac{|V_1|}{2|V_1| + 1} \ge \frac{1}{3}$.

In all 3 cases, $\frac{|V_1|}{|V_2|} \ge \frac{1}{3}$, hence $\rho \ge 1/3$.
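The bound of Problem 5(b) can be checked on small trees with a short Python sketch. This is only an illustration under stated assumptions: the function name `most_balanced_edge` and the adjacency-dictionary input format are hypothetical, and the code simply tries every edge of a given tree instead of following the case analysis of the proof.

```python
from fractions import Fraction


def most_balanced_edge(adjacency):
    """Return (ratio, edge) for the edge whose removal splits the tree most evenly.

    adjacency maps each vertex to a list of its neighbours and is assumed to
    describe a tree with at least two vertices. The ratio is |V1| / |V2| with
    |V1| <= |V2|, as an exact fraction.
    """
    n = len(adjacency)
    root = next(iter(adjacency))
    parent = {root: None}
    order = []  # vertices in DFS preorder: parents before children
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adjacency[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)

    # Subtree sizes, accumulated from the leaves upward.
    size = {v: 1 for v in adjacency}
    for v in reversed(order):
        if parent[v] is not None:
            size[parent[v]] += size[v]

    # Removing the edge (parent[v], v) separates size[v] vertices from the rest.
    best = None
    for v in order[1:]:
        small = min(size[v], n - size[v])
        ratio = Fraction(small, n - small)
        if best is None or ratio > best[0]:
            best = (ratio, (parent[v], v))
    return best
```

On a star with one center and three leaves, every edge gives the ratio 1/3, which matches the upper bound; on a path with four vertices the middle edge gives the ratio 1.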