Data structures
Exercise 1 solution

Question 1
Let's start by writing all the functions in big-O notation:
f_1(n) = 2017 = O(1)
f_2(n) = 2^(2 log_2 n) = O(n^2)
f_3(n) = 2^n = O(2^n)
f_4(n) = 1/n = O(1/n)
f_5(n) = 3^n = O(3^n)
f_6(n) = 2^(3^n) = O(2^(3^n))
f_7(n) = n^n = O(n^n)
f_8(n) = 3^(2^n) = O(3^(2^n))
f_9(n) = log(√n) = log(n^0.5) = 0.5 log n = O(log n)
f_10(n) = log(2^n · n^2) = log(2^n) + log(n^2) = n log 2 + 2 log n = O(n + log n) = O(n)
f_11(n) = log(n^10) = 10 log n = O(log n)
f_12(n) = n^2 + log n + n = O(n^2)

Now we can sort these functions from smallest to largest:
f_4 ≺ f_1 ≺ f_9, f_11 ≺ f_10 ≺ f_2, f_12 ≺ f_3 ≺ f_5 ≺ f_7 ≺ f_8 ≺ f_6

Notable proofs:
f_4 < f_1: lim_{n→∞} (1/n)/2017 = 0
f_1 < f_9: lim_{n→∞} 2017/(0.5 log n) = 0
f_10 < f_2: lim_{n→∞} log(2^n · n^2) / 2^(2 log_2 n) = 0, since 2^(2 log_2 n) = 2^(log_2 n^2) = n^2 = Θ(n^2), while the numerator is only O(n).
f_8 < f_6: lim_{n→∞} 3^(2^n) / 2^(3^n) = 0; the denominator grows much faster than the numerator. Another way to compare f_8(n) = 3^(2^n) and f_6(n) = 2^(3^n) is to take log of both. We get log(3^(2^n)) = 2^n log 3 = O(2^n) and log(2^(3^n)) = 3^n log 2 = O(3^n). Since 2^n log 3 < 3^n for large n, we can conclude that 3^(2^n) = O(2^(3^n)).

Question 2
a. Correct. Consider f(n) = n^n and the ratio (n−k)^(n−k) / n^n. Taking log:
lim [(n−k) log(n−k) − n log n] = lim [n(log(n−k) − log n) − k log(n−k)] = −∞,
since n(log(n−k) − log n) = log((1 − k/n)^n) → −k while −k log(n−k) → −∞. Hence the ratio tends to 0.
b. False. Let's assume by contradiction that there exist n_0, c > 0 such that for all n > n_0, (f(n))^2 ≤ c·f(n). Then for all n > n_0, f(n) ≤ c, in contradiction to the fact that f(n) = Ω(log n).
c. Correct: f(n) + g(n) = O(f(n) + g(n)) = O(max(f(n), g(n))), and assuming both functions are greater than or equal to 1 we get O(max(f(n), g(n))) = O(f(n)·g(n)).

Question 3
a. T(n) = T(√n) + 1
1) By iteration:
T(n) = T(n^(1/2)) + 1
= [T(n^(1/4)) + 1] + 1 = T(n^(1/4)) + 2
= [T(n^(1/8)) + 1] + 2 = T(n^(1/8)) + 3
= ... = T(n^(1/2^i)) + i
We stop at the constant 2: n^(1/2^i) = 2 gives
(1/2^i) · log n = 1  ⟹  2^i = log n  ⟹  i = log(log n)
T(n) = T(n^(1/2^(log log n))) + log(log n) = T(2) + log(log n) = Θ(1) + Θ(log(log n)) = Θ(log(log n))

2) By master method:
Substitute m = log n, i.e. n = 2^m and √n = 2^(m/2):
T(2^m) = T(2^(m/2)) + 1
Define S(m) = T(2^m), so S(m) = S(m/2) + 1 with a = 1, b = 2, f(m) = 1 = Θ(1) = Θ(m^0) = Θ(m^(log_2 1)).
This is case 2 of the master method, therefore S(m) = Θ(log m) and
T(n) = T(2^m) = S(m) = Θ(log m) = Θ(log(log n)).

b. T(n) = 5T(n/2) + n^3 log n
1) By iteration:
T(n) = 5T(n/2) + n^3 log n
= 5(5T(n/4) + (n/2)^3 log(n/2)) + n^3 log n
= 5(5(5T(n/8) + (n/4)^3 log(n/4)) + (n/2)^3 log(n/2)) + n^3 log n
= 5^i T(n/2^i) + n^3 Σ_{j=0}^{i−1} (5/8)^j log(n/2^j)
Let's find i: n/2^i = 1 ⟹ i = log n. Then, using log(n/2^j) = log n − j:
T(n) = 5^(log n) + n^3 Σ_{j=0}^{log n − 1} (5/8)^j (log n − j) ≤ 5^(log n) + c·n^3 log n
Since 5^(log n) = n^(log 5) ≈ n^2.32 = o(n^3), we get T(n) = O(5^(log n) + n^3 log n) = O(n^3 log n). Obviously T(n) = Ω(n^3 log n), since the very first iteration already contributes n^3 log n, and therefore T(n) = Θ(n^3 log n).
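The iteration result for part (b) can also be sanity-checked numerically. A minimal sketch, assuming T(1) = 1 as the base case (the exercise leaves it unspecified), evaluates the recurrence exactly on powers of two and watches the ratio T(n) / (n^3 log n) approach a constant:

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n: int) -> float:
    """Evaluate T(n) = 5*T(n/2) + n^3 * log2(n) exactly on powers of two."""
    if n <= 1:
        return 1.0  # assumed base case, not given in the exercise
    return 5 * T(n // 2) + n**3 * log2(n)

# If T(n) = Theta(n^3 log n), the ratio below should stabilize
# near a constant (the geometric series suggests about 8/3).
for k in (6, 10, 14):
    n = 2**k
    print(k, T(n) / (n**3 * log2(n)))
```

The ratios grow slowly toward 8/3 because of the −Θ(n^3) lower-order term in the exact sum, which is consistent with the Θ(n^3 log n) bound.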
2) By master method:
a = 5, b = 2, f(n) = n^3 log n, and n^(log_2 5) ≈ n^2.32.
For ε = 0.5: n^3 log n = Ω(n^(2.32 + 0.5)).
Regularity condition a·f(n/b) ≤ c·f(n):
5 · (n/2)^3 · log(n/2) ≤ c · n^3 log n
(5/8) · n^3 (log n − 1) ≤ c · n^3 log n,
which holds for c = 5/8 < 1.
This is case 3 of the master method, therefore T(n) = Θ(n^3 log n).

c. T(n) = T(cn) + T((1−c)n) + 1, 0 < c < 1
By substitution method (induction). Without loss of generality, let c ≥ 1 − c, so that 0 < 1−c ≤ 1/2 and 1/2 ≤ c < 1. We can start by drawing a recursion tree.
The recursion tree is full for log_{1/(1−c)} n levels (the shallowest leaf lies along the (1−c)-branch), contributing 1 + 2 + 4 + ... nodes of cost 1 each, and it has log_{1/c} n levels in total (along the c-branch). The tree has Θ(n) leaves, so we guess T(n) = Ω(n) and T(n) = O(n), and prove both bounds by induction.

Lower bound:
Guess: T(n) ≥ dn for all n ≥ n_0.
Induction base: T(1) = 1 ≥ d·1 for d ≤ 1.
Induction step: assume T(m) ≥ dm for all 1 ≤ m < n. Then
T(n) = T(cn) + T((1−c)n) + 1 ≥ dcn + d(1−c)n + 1 = dn + 1 ≥ dn,
so T(n) = Ω(n).

Upper bound: revise the guess by subtracting a lower-order term.
Guess: T(n) ≤ dn − b for all n ≥ n_0, where b > 0 is a constant.
Induction base: T(1) = 1 ≤ d·1 − b; clearly this holds for a proper choice of d and b.
Induction step: assume T(m) ≤ dm − b for all 1 ≤ m < n. Then
T(n) = T(cn) + T((1−c)n) + 1 ≤ (dcn − b) + (d(1−c)n − b) + 1 = dn − 2b + 1 ≤ dn − b for b ≥ 1,
so T(n) = O(n).

d. T(n) = T(3n/5) + 2T(n/5) + n
Again, we can start with a recursion tree to make a guess: since 3/5 + 2·(1/5) = 1, every level of the tree sums to n, and there are Θ(log n) levels, so we guess T(n) = O(n log n) and T(n) = Ω(n log n) and prove both bounds by induction.

Upper bound:
Guess: T(n) ≤ cn log n for n ≥ 2.
Induction base: T(2) = const ≤ c · 2 · log 2 for a proper choice of c.
Induction step: assume T(m) ≤ cm log m for all 2 ≤ m < n. Then
T(n) = T(3n/5) + 2T(n/5) + n
≤ c·(3n/5)·log(3n/5) + 2c·(n/5)·log(n/5) + n
= c·(3n/5)·log 3 + c·(3n/5)·log n − c·(3n/5)·log 5 + 2c·(n/5)·log n − 2c·(n/5)·log 5 + n
= cn log n + c·(3n/5)·log 3 − cn log 5 + n
≤ cn log n if c·(3n/5)·log 3 − cn log 5 + n ≤ 0, i.e. −1.37cn + n ≤ 0, i.e. n(1 − 1.37c) ≤ 0, which holds for c ≥ 0.8.
Therefore, T(n) = O(n log n).

Lower bound:
Guess: T(n) ≥ dn log n for all n.
Induction base: T(1) = 1 ≥ d · 1 · log 1 = 0.
Induction step: assume T(m) ≥ dm log m for all 1 ≤ m < n. Then
T(n) = T(3n/5) + 2T(n/5) + n
≥ d·(3n/5)·log(3n/5) + 2d·(n/5)·log(n/5) + n
= d·(3n/5)·log 3 + d·(3n/5)·log n − d·(3n/5)·log 5 + 2d·(n/5)·log n − 2d·(n/5)·log 5 + n
= dn log n + d·(3n/5)·log 3 − dn log 5 + n
≥ dn log n if d·(3n/5)·log 3 − dn log 5 + n ≥ 0, i.e. n(1 − 1.37d) ≥ 0, which holds for d ≤ 0.7.
Therefore, T(n) = Ω(n log n).
We conclude that T(n) = Θ(n log n).

e. T(n) = 2T(n−1) + 1
Use the iteration method:
T(n) = 2T(n−1) + 1
= 2[2T(n−2) + 1] + 1 = 4T(n−2) + 2 + 1
= 4[2T(n−3) + 1] + 2 + 1 = 8T(n−3) + 4 + 2 + 1
= ... = 2^i T(n−i) + (1 + 2 + 4 + ... + 2^(i−1))
After i = n−1 iterations we have reached T(1):
T(n) = 2^(n−1) T(1) + (2^(n−1) − 1) = Θ(2^n).

Question 4
a. i loops from 1 to n−1, while j loops from n down to i+1. Inside the loop we do Θ(1) operations.
So this gives:
T(n) = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 1 = Σ_{i=1}^{n−1} (n − i) = (n−1) + (n−2) + ... + 1 = n(n−1)/2 = n^2/2 − n/2
And in big-Θ notation: T(n) = Θ(n^2).

b. We can define a recursive formula for this method. When n is 0 or 1: T(n) = 1. Else: T(n) = T(n−1) + c, where c is a constant. That is:
T(n) = 1 if n = 0 or 1; T(n) = T(n−1) + c otherwise.
T(n) = T(n−1) + c = T(n−2) + c + c = ... = T(n−k) + k·c
n − k = 1 ⟹ k = n−1 ⟹ T(n) = T(1) + (n−1)·c = Θ(n)

c. In the same way, when n is 0 or 1: T(n) = 1. Else if n is even: T(n) = T(n/2) + c. Else (that is, n is odd): T(n) = T((n−1)/2) + c. That is:
T(n) = 1 if n = 0 or 1; T(n) = T(n/2) + c if n is even; T(n) = T((n−1)/2) + c otherwise.
When n is large, there is almost no difference between n/2 and (n−1)/2: both cut n nearly evenly in half. Consequently, we make the guess that T(n) = Θ(log n) and prove it by induction.

Question 5
a. We have two inputs, a sorted array A and an integer x, and we want to return the index of x, or −1 if x is not in the array. Let d denote the position of x in A. We define the following procedure:
1) We probe the elements of A in jumps of 2^i, i = 0, 1, 2, ..., until we reach the first element that is greater than or equal to x. That is, we probe A[1], A[2], A[4], .... Let A[2^j] be the first probed element with A[2^j] ≥ x. Clearly, A[2^(j−1)] < x ≤ A[2^j].
We need at most log d + 1 probes to reach A[2^j], since 2^(j−1) < d; therefore this part takes O(log d).
2) The size of the interval [2^(j−1) .. 2^j] is at most d, therefore a binary search in A[2^(j−1) .. 2^j] takes O(log d).
Let's analyze the total complexity: O(log d) + O(log d) = O(log d).

b. The idea: at each step we compare the medians of the current arrays A and B. Denote the size of A by n and the size of B by m, and let m1 be the median of A and m2 the median of B.
1. If m1 = m2 then we are done: m1 (= m2) is the median.
2. If m1 > m2, then none of the elements A[n/2 + 1 .. n] can be the median, because each of them is greater than at least half of the elements of A and half of the elements of B (the left halves of A and B, respectively). Similarly, none of the elements B[1 .. m/2] can be the median, since each of them is less than at least half of the elements of B and half of the elements of A.
3. The case m2 > m1 is symmetric to (2).

The algorithm:
1. Let m1 and m2 be the medians of A and B, respectively.
2. If m1 = m2 then we are done: return m1.
3. Else if m1 > m2, discard the right half of A, i.e. A[n/2 + 1 .. n], and the left half of B, i.e. B[1 .. m/2].
4. Else (m2 > m1), discard the right half of B, i.e. B[m/2 + 1 .. m], and the left half of A, i.e. A[1 .. n/2].
5. Repeat the process until one of the base cases:
   1. A is empty: return the median of B.
   2. A has one element: use binary search on B to find the location of the only element of A in B, then return the median of B as if that element had been inserted into B.
   3. and 4. The symmetric cases for B are handled like 1 and 2.

Run time: at each step the size of either A or B is halved, so we get O(log n + log m).
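The two-phase search of part (a) can be sketched in Python. This is a minimal sketch with 0-based indexing (the text uses 1-based positions); the function name `exp_search` is my own:

```python
import bisect

def exp_search(A, x):
    """Find the index of x in sorted list A, or -1 if absent.

    Phase 1 probes positions 1, 2, 4, 8, ... until an element >= x
    brackets the target; phase 2 binary-searches that interval.
    Runs in O(log d), where d is the position of x in A.
    """
    n = len(A)
    if n == 0:
        return -1
    bound = 1
    while bound < n and A[bound] < x:  # doubling phase: O(log d) probes
        bound *= 2
    lo, hi = bound // 2, min(bound, n - 1)
    i = bisect.bisect_left(A, x, lo, hi + 1)  # binary search: O(log d)
    return i if i <= hi and A[i] == x else -1
```

For example, searching for an element near the front of a very long array touches only O(log d) cells, which is the point of the doubling phase.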
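Part (b)'s divide-and-conquer idea can be sketched with a closely related variant: instead of comparing the two medians directly, search for the k-th smallest element of the merged order, discarding about k/2 elements per step. This handles arrays of different sizes uniformly and has the same logarithmic running-time flavor; it is a sketch of the technique, not the exercise's exact procedure, and the helper names are my own:

```python
def kth_smallest(a, b, k):
    """Return the k-th smallest (1-based) element of two sorted lists.

    Each call discards about k/2 elements from one list, so the
    recursion depth is O(log(n + m)).
    """
    if not a:
        return b[k - 1]
    if not b:
        return a[k - 1]
    if k == 1:
        return min(a[0], b[0])
    i = min(len(a), k // 2)
    j = min(len(b), k // 2)
    if a[i - 1] <= b[j - 1]:
        # a[0..i-1] are all among the k smallest: safe to discard them
        return kth_smallest(a[i:], b, k - i)
    else:
        return kth_smallest(a, b[j:], k - j)

def median_of_two(a, b):
    """Median of the union of two sorted lists a and b."""
    t = len(a) + len(b)
    if t % 2 == 1:
        return kth_smallest(a, b, (t + 1) // 2)
    return (kth_smallest(a, b, t // 2) + kth_smallest(a, b, t // 2 + 1)) / 2
```

The discard step mirrors the argument in the text: when a[i−1] ≤ b[j−1], all of a[0..i−1] rank strictly before the k-th element of the merged order, so they cannot be the answer.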