CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 )


Task 1. [ 80 Points ] Average Case Analysis of Median-of-3 Quicksort

Consider the median-of-3 quicksort algorithm given in Figure 1.

Median-of-3-Quicksort( A[1 : n], n )

Input: An array A[1 : n] of n distinct numbers.
Output: A[1 : n] with its numbers sorted in increasing order of value.

1. if n = 2 then
2.    if A[2] < A[1] then swap A[1] and A[2]
3. elif n > 2 then
4.    x ← median of A[1], A[2] and A[3]
5.    rearrange the numbers of A[1 : n] such that (i) A[k] = x for some k ∈ [1, n], (ii) A[i] < x for each i ∈ [1, k − 1], and (iii) A[i] > x for each i ∈ [k + 1, n]
6.    Median-of-3-Quicksort( A[1 : k − 1], k − 1 )
7.    Median-of-3-Quicksort( A[k + 1 : n], n − k )
8. return

Figure 1: A variant of the standard quicksort algorithm that uses the median of the first three numbers in its input (sub-)array as the pivot.

Given an input of size n, in this task we will analyze the average number of element comparisons (i.e., comparisons between two numbers of the input array) performed by this algorithm over all n! possible permutations of the input numbers. We will assume that the partitioning algorithm is stable, i.e., if two numbers p and q end up in the same partition and p appears before q in the input, then p must also appear before q in the resulting partition.

(a) [ 10 Points ] Show how to implement steps 4 and 5 of Figure 1 to get a stable partitioning of the numbers in A[1 : n] using only $n - \frac{1}{3}$ element comparisons on average, where the average is taken over all n! possible permutations of the input numbers.

(b) [ 10 Points ] Let t_n be the average number of element comparisons performed by the algorithm given in Figure 1 to sort A[1 : n], where n ≥ 0 and the average is taken over all n! possible permutations of the numbers in A. Show that

$$t_n = \begin{cases} 0 & \text{if } n < 2, \\ 1 & \text{if } n = 2, \\ n - \frac{1}{3} + \frac{6}{n(n-1)(n-2)} \sum_{k=1}^{n} (k-1)(n-k)\left(t_{k-1} + t_{n-k}\right) & \text{otherwise.} \end{cases}$$

(c) [ 20 Points ] Let T(z) be a generating function for t_n:

$$T(z) = t_0 + t_1 z + t_2 z^2 + \ldots + t_n z^n + \ldots$$

Show that

$$T'''(z) = \frac{12}{(1-z)^2}\,T'(z) - \frac{8}{(1-z)^4} + \frac{24}{(1-z)^5}.$$

(d) [ 20 Points ] Solve the differential equation from part (c) to show that

$$T(z) = -\frac{3}{7}\left(4\ln(1-z) + \frac{28}{9}z + \frac{29}{63}\right)\frac{1}{(1-z)^2} - \frac{2}{735}(1-z)^5 + \frac{1}{5}.$$

(e) [ 15 Points ] Use your solution from part (d) to show that for n ≥ 0,

$$t_n = \frac{12}{7}(n+1)H_n - \frac{159}{49}\,n - \frac{29}{147} - \frac{2}{735}(-1)^n\binom{5}{n} + \frac{1}{5}\,[\,n = 0\,],$$

where $H_n = \sum_{k=1}^{n} \frac{1}{k}$ is the nth Harmonic number and $[\,n = 0\,]$ equals 1 if n = 0 and 0 otherwise.¹ Compute the numerical value of t_n for 0 ≤ n ≤ 10.

(f) [ 5 Points ] Use your solution from part (e) to show that $t_n = \Theta(n \log n)$.

Task 2. [ 60 Points ] A Linear Sieve

In this task we will analyze the running time of a linear sieve, a variant of the original Sieve of Eratosthenes modified to mark each composite exactly once. In contrast, the number of times the original sieve marks a composite C is equal to the number of prime factors of C, and hence for finding all primes in [2, N] it marks the composites in that range around N log log N times in total. The linear sieve we will consider in this task is known as the Sieve of Gries and Misra, or the GM linear sieve. Figure 2 shows an implementation of the GM linear sieve that uses a supporting data structure D composed of two priority queues and a stack. Indeed, one can show that when external-memory priority queues are used, this implementation becomes more I/O-efficient than the standard implementation that does not use priority queues. In this task, however, we are not concerned with I/O-efficiency. So, we will analyze the internal-memory running time of the implementation shown in Figure 2 when internal-memory priority queues (e.g., binary heaps, binomial heaps) are used.
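To make the contrast concrete, here is a small sketch (an illustration, not part of the assignment; the function names are mine) comparing the two marking counts: the classic sieve marks each composite once per distinct prime factor, whereas a sieve that marks each composite exactly once performs only as many markings as there are composites in [2, N].

```python
def count_classic_marks(n):
    """Classic Sieve of Eratosthenes on [2, n]: every composite is marked
    once per distinct prime factor, ~ n log log n markings in total."""
    composite = [False] * (n + 1)
    marks = 0
    for p in range(2, n + 1):
        if not composite[p]:                 # p is prime
            for m in range(2 * p, n + 1, p):
                composite[m] = True          # m gets one mark per prime p | m
                marks += 1
    return marks


def count_linear_marks(n):
    """A linear sieve marks each composite exactly once, so its marking
    count is simply the number of composites in [2, n]."""
    composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not composite[p]:
            for m in range(2 * p, n + 1, p):
                composite[m] = True
    return sum(composite)                    # number of composites in [2, n]
```

For n = 10 the classic sieve performs 7 markings (e.g., 6 and 10 are each marked twice) while a linear sieve performs only 5, one per composite, and the gap widens as n grows.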
The GM linear sieve uses the following property of composite numbers to reduce the number of times it marks them: each composite number C can be represented uniquely as C = p^r q, where p is the smallest prime factor of C, r is a positive integer, and either q = p or q (> p) is not divisible by

¹ Compare this with $t_n = 2(n+1)H_n - 4n$, which we obtained when we analyzed standard quicksort in Lecture 7.

Linear-Sieve( N )   {find all prime numbers in [2, N]}
1. create support data structure D
2. D.Init( N ), p ← 1   {initialize support data structure D}
3. while p ≤ √N do   {output all primes in [2, √N]}
4.    p′ ← D.InvSucc( p )   {assuming that all composites with value ≤ N and divisible by primes ≤ p are already in D, find the smallest integer p′ > p that does not appear as a composite in D}
5.    print p′   {then this p′ must be a prime}
6.    p ← p′, q ← p
7.    while pq ≤ N do
8.       for r ← 1 to ⌊log_p (N/q)⌋ do   {insert each composite of the form p^r q with value ≤ N into D,
9.          D.Insert( p^r q )   where either q = p, or q > p but is not divisible by p}
10.      q ← D.InvSucc( q )   {find the next q that is not divisible by p}
11.      D.Save( q )   {save q as we do not yet know whether it is a prime or a composite}
12.   D.Restore( )   {restore all saved q's}
13. while p ≤ N do   {output all primes in (√N, N]}
14.   p ← D.InvSucc( p )
15.   if p ≤ N then print p   {p must be a prime}

D.Init( N )   {initialize support data structure D for computing primes in [2, N]}
1. D.Q1 ← {2, ..., N}   {D.Q1 is a priority queue containing numbers not yet known to be composites}
2. D.Q2 ← ∅   {D.Q2 is a priority queue containing composites we have discovered that are yet to be deleted from D.Q1}
3. D.S ← ∅   {D.S is a stack}

D.InvSucc( x )   {return the smallest number larger than x which is not yet known to be a composite}
1. while Find-Min( D.Q1 ) ≤ x do   {get rid of all numbers ≤ x from D.Q1}
2.    Extract-Min( D.Q1 )
3. while Find-Min( D.Q2 ) ≤ Find-Min( D.Q1 ) do   {keep removing numbers from D.Q1 (in increasing order of value) which also belong to D.Q2 (i.e., are known to be composites) until finding one that does not belong to D.Q2}
4.    if Find-Min( D.Q2 ) = Find-Min( D.Q1 ) then
5.       Extract-Min( D.Q1 )
6.    Extract-Min( D.Q2 )   {remove composites from D.Q2 which have already been removed from D.Q1}
7. y ← Extract-Min( D.Q1 )   {y is the smallest number in D.Q1 which does not belong to D.Q2 and thus is not known to be a composite}
8. return y

D.Insert( x )   {x is a composite to be deleted from D.Q1}
1. Insert( D.Q2, x )   {store x in D.Q2 for deletion from D.Q1 at a convenient time later}

D.Save( x )   {save x as we do not yet know whether it is a prime or not}
1. Push( D.S, x )   {store x in stack D.S}

D.Restore( )   {empty the contents of D.S into D.Q1}
1. while D.S ≠ ∅ do
2.    x ← Pop( D.S )   {return the contents of stack D.S
3.    Insert( D.Q1, x )   to the priority queue D.Q1}

Figure 2: An implementation of the GM linear sieve using two priority queues and a stack.
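The following is a compact executable sketch of the scheme in Figure 2, assuming Python's `heapq` (a binary heap) for the two priority queues and a list for the stack; the function and variable names are mine, and a sentinel value N + 1 is kept in Q1 so that Find-Min is always defined. It is an illustration of the control flow, not the required analysis.

```python
import heapq

def linear_sieve(N):
    """Primes in [2, N], following the structure of Figure 2: Q1 holds
    numbers not yet known to be composite, Q2 holds discovered composites
    awaiting lazy deletion from Q1, and S saves candidates of unknown
    status during the inner loop."""
    if N < 2:
        return []
    Q1 = list(range(2, N + 2))   # ascending range is already a valid heap;
                                 # N + 1 is a sentinel so Q1 never empties
    Q2 = []                      # D.Q2
    S = []                       # D.S

    def inv_succ(x):
        """D.InvSucc: smallest number > x not yet known to be composite."""
        while Q1[0] <= x:                    # discard everything <= x
            heapq.heappop(Q1)
        while Q2 and Q2[0] <= Q1[0]:         # cancel matched composites
            if Q2[0] == Q1[0]:
                heapq.heappop(Q1)
            heapq.heappop(Q2)
        return heapq.heappop(Q1)

    primes = []
    p = 1
    while p * p <= N:            # primes up to sqrt(N) do the marking
        p = inv_succ(p)
        primes.append(p)
        q = p
        while p * q <= N:
            c = p * q
            while c <= N:        # D.Insert(p^r * q) for r = 1, 2, ...
                heapq.heappush(Q2, c)
                c *= p
            q = inv_succ(q)      # next q not known to be composite
            S.append(q)          # D.Save(q): may be prime or composite
        while S:                 # D.Restore: saved q's go back into Q1
            heapq.heappush(Q1, S.pop())
    while p <= N:                # remaining non-composites are primes
        p = inv_succ(p)
        if p <= N:
            primes.append(p)
    return primes
```

Tracing, say, `linear_sieve(20)` shows each composite entering Q2 exactly once (4, 8, 16 as powers of 2; 6, 12 from q = 3; 10, 20 from q = 5; 14 from q = 7; 18 from q = 9; then 9 and 15 for p = 3), which is the defining property of the linear sieve.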

p. Hence, one can generate all composites in lexicographical order using a triply nested loop, with p in the outermost loop, q in the middle and r in the innermost loop, and this will generate/mark every composite exactly once.

The support data structure D has three components: two priority queues D.Q1 and D.Q2, and one stack D.S. The priority queues support three operations: Insert, Find-Min and Extract-Min. The stack supports Push and Pop. The data structure D itself supports the following four operations (see Figure 2 for details): D.Insert, D.InvSucc, D.Save and D.Restore. It also has an initialization function D.Init. When called with parameter N, the Linear-Sieve function shown in Figure 2 uses this data structure to find all prime numbers in [2, N]. Now answer the following questions.

(a) [ 10 Points ] Assuming that D.Q1 and D.Q2 are standard binary heaps that support Insert, Find-Min and Extract-Min operations in O(log n), O(1) and O(log n) worst-case time, respectively, where n is the number of items in the heap, find the worst-case running times of D.Insert, D.InvSucc, D.Save and D.Restore in terms of N.

(b) [ 5 Points ] Based on your results from part (a), give an upper bound on the worst-case running time of Linear-Sieve( N ).

(c) [ 30 Points ] Under the assumption that D.Q1 and D.Q2 are standard binary heaps as in part (a), show that the amortized time complexities of D.Insert, D.InvSucc, D.Save and D.Restore are Θ(log N), Θ(1), Θ(log N) and Θ(1), respectively.

(d) [ 5 Points ] Based on your results from part (c), give an upper bound on the worst-case running time of Linear-Sieve( N ).

(e) [ 10 Points ] Suppose D.Q1 and D.Q2 are binomial heaps that support Insert, Find-Min and Extract-Min operations in O(1), O(1) and O(log n) amortized time, respectively, where n is the number of items in the heap. Then what amortized bounds do you get for D.Insert, D.InvSucc, D.Save and D.Restore?
Based on those bounds, give an upper bound on the worst-case running time of Linear-Sieve( N ).

Task 3. [ 40 Points ] A Binomial Heap Variant Supporting Decrease-Key Operations

We modify the lazy binomial heap implementation (with the doubly linked list representation) to support Decrease-Key operations as follows. Let us denote the modified heap by H. Each node x of H will now have a flag called dirty. We will say that node x is clean provided x.dirty = false; otherwise it is dirty. Initially, x.dirty is set to false. Only a Decrease-Key operation performed on x can set x.dirty to true.

An Insert( H, x ) operation sets x.dirty = false, creates a B_0 containing x, and adds the new B_0 to the doubly linked list containing all binomial trees of H.

A Decrease-Key( H, x, k ) operation is performed provided x.dirty = false and k < x.key. It sets x.dirty = true, creates a new node y and sets y.key = k. Then it performs Insert( H, y ).
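The Insert / Decrease-Key bookkeeping described above can be sketched as follows (a minimal illustration, not the required solution; the class and method names are mine, and the linking of binomial trees is omitted since only the flag mechanics are shown).

```python
class Node:
    """A heap node carrying the dirty flag described above."""
    def __init__(self, key):
        self.key = key
        self.dirty = False       # set to True only by Decrease-Key
        self.children = []       # children in its binomial tree

class LazyHeap:
    def __init__(self):
        self.roots = []          # stands in for the doubly linked root list

    def insert(self, key):
        """Insert: create a clean B_0 and add it to the root list."""
        x = Node(key)
        self.roots.append(x)
        return x

    def decrease_key(self, x, k):
        """Decrease-Key: mark the old copy of the item dirty, then
        re-insert the item with its new, smaller key."""
        assert not x.dirty and k < x.key
        x.dirty = True
        return self.insert(k)
```

For example, after `h = LazyHeap()`, `a = h.insert(10)` and `b = h.decrease_key(a, 3)`, node `a` is dirty and its stale key 10 remains in the structure; such stale copies are only discarded during the cleanup that Extract-Min performs.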

An Extract-Min( H ) operation first performs a cleanup of H. The way the cleanup phase works depends on the percentage of dirty nodes in H. If the data structure contains more dirty nodes than clean nodes, then the cleanup phase involves removing all dirty nodes from H and inserting each clean node as a separate B_0 into the linked list. Otherwise, the cleanup phase proceeds as follows. It scans the doubly linked list in one direction, and when it encounters some B_k with a dirty root, it removes that root from H and inserts its k children into the doubly linked list right in front of the current scan location (meaning that the scan will encounter these k trees before encountering any other tree currently in the linked list). The scan stops when the linked list no longer has a tree with a dirty root. Note that the trees can still have dirty (internal) nodes, but there will be no dirty roots.

After the cleanup phase, an Extract-Min operation proceeds in exactly the way we saw in class: convert the doubly linked list representation to the array representation, perform Extract-Min on the array representation, and finally convert the array representation back to the doubly linked list representation. Now answer the following questions.

(a) [ 30 Points ] Suppose we want to show that the amortized costs of Insert and Decrease-Key operations are O(1) and O(f(n)), respectively, where n is the number of clean nodes in H and f(n) is any non-decreasing positive function of n. Then what is the best amortized (upper) bound you can get for the cost of an Extract-Min operation?

(b) [ 10 Points ] How will you modify the implementation above to also support Find-Min operations in amortized O(1) time?