Searching. Sorting. Lambdas


Babes-Bolyai University, arthur@cs.ubbcluj.ro

Overview

Feedback for the course: you can write feedback at academicinfo.ubbcluj.ro. It is both important and anonymous. Write both what you like (so we keep and improve it) and what you don't. Best if you write about all activities (lecture, seminar and laboratory).

Searching: data are available in the internal memory, as a sequence of records (k_1, k_2, ..., k_n). Search for a record having a certain value for one of its fields, called the search key. If the search is successful, we have the position of the record in the given sequence. We approach the two variants of the search separately: with unordered keys and with ordered keys.

Searching - unordered keys

Problem specification
Data: a, n, (k_i, i = 0, .., n-1), where n ∈ N, n ≥ 0.
Results: p, where (0 ≤ p ≤ n-1 and a = k_p), or p = -1 if the key is not found.

Searching - unordered keys

def searchSeq(el, l):
    """Search for an element in list
    el - element
    l - list of elements
    Return the position of the element, -1 if not found
    """
    poz = -1
    for i in range(0, len(l)):
        if el == l[i]:
            poz = i
    return poz

Computational complexity is T(n) = Σ_{i=0}^{n-1} 1 = n ∈ Θ(n)

Searching - unordered keys

def searchSeq(el, l):
    """Search for an element in list
    el - element
    l - list of elements
    Return the position of the element, -1 if not found
    """
    i = 0
    while i < len(l) and el != l[i]:
        i += 1
    if i < len(l):
        return i
    return -1

What is the difference between this and the previous version?
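The question above can be answered by instrumenting both versions (the counting helpers below are my additions, not part of the course code): the while-loop version stops at the first match and returns the first occurrence, whereas the for-loop version always scans the whole list and keeps the last occurrence.

```python
def searchSeqFull(el, l):
    # Full-scan version: always examines every element,
    # and poz ends up at the LAST matching position
    poz = -1
    count = 0
    for i in range(len(l)):
        count += 1
        if el == l[i]:
            poz = i
    return poz, count

def searchSeqEarly(el, l):
    # Early-exit version: stops at the FIRST matching position
    count = 0
    i = 0
    while i < len(l) and el != l[i]:
        count += 1
        i += 1
    return (i if i < len(l) else -1), count

data = [7, 3, 9, 3, 5]
print(searchSeqFull(3, data))   # (3, 5): last match, 5 comparisons
print(searchSeqEarly(3, data))  # (1, 1): first match, 1 comparison
```

The asymptotic class is unchanged, but the early-exit version does less work on average for successful searches.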

Searching - unordered keys

Best case: the element is at the first position, T(n) ∈ Θ(1).
Worst case: the element is at position n-1 or not in the list, T(n) ∈ Θ(n).
Average case: if the element is uniformly distributed, the loop can be executed 0, 1, .., n-1 times, so T(n) = (0 + 1 + .. + (n-1))/n = (n-1)/2 ∈ Θ(n).
Overall complexity is O(n).

Searching - ordered keys

Problem specification
Data: a, n, (k_i, i = 0, .., n-1), where n ∈ N, n ≥ 0, and k_0 < k_1 < ... < k_{n-1}.
Results: p, where (p = 0 and a ≤ k_0) or (p = n and a > k_{n-1}) or (0 < p ≤ n-1 and k_{p-1} < a ≤ k_p).

Searching - ordered keys

def searchSeq(el, l):
    """Search for an element in list
    el - element
    l - list of ordered elements
    Return the position of the first occurrence,
    or position where element can be inserted
    """
    if len(l) == 0:
        return 0
    poz = -1
    for i in range(0, len(l)):
        if el <= l[i]:
            poz = i
            break
    if poz == -1:
        return len(l)
    return poz

Worst-case computational complexity is T(n) = Σ_{i=0}^{n-1} 1 = n ∈ Θ(n)

Searching - ordered keys

def searchSuccessor(el, l):
    """Search for an element in list
    el - element
    l - list of ordered elements
    Return the position of the first occurrence,
    or position where element can be inserted
    """
    if len(l) == 0 or el <= l[0]:
        return 0
    if el > l[-1]:
        return len(l)
    i = 0
    while i < len(l) and el > l[i]:
        i += 1
    return i

Searching - ordered keys

Best case: the element is at the first position, T(n) ∈ Θ(1).
Worst case: the element is at position n-1 or not in the list, T(n) ∈ Θ(n).
Average case: if the element is uniformly distributed, the loop can be executed 0, 1, .., n-1 times, so T(n) = (0 + 1 + .. + (n-1))/n = (n-1)/2 ∈ Θ(n).
Overall complexity is O(n).

Sequential search: keys are successively examined; keys may not be ordered.
Binary search: uses the divide and conquer technique; keys must be ordered.

Recursive binary-search algorithm

def binarySearch(key, data, left, right):
    """Search for an element in an ordered list
    key - element to search
    left, right - bounds of the search
    Return insertion position of key that keeps list ordered
    """
    if left >= right - 1:
        return right
    middle = (left + right) // 2
    if key < data[middle]:
        return binarySearch(key, data, left, middle)
    else:
        return binarySearch(key, data, middle, right)

# Usage, assuming data is an ordered list:
print(binarySearch(2000, data, 0, len(data)))

Recursive binary-search function

def search(key, data):
    """Search for an element in an ordered list
    key - element to search
    data - the list
    Return insertion position of key that keeps list ordered
    """
    if len(data) == 0 or key < data[0]:
        return 0
    if key > data[-1]:
        return len(data)
    return binarySearch(key, data, 0, len(data))

Binary-search recurrence

The recurrence:
T(n) = 1,            if n = 1
T(n) = T(n/2) + 1,   if n > 1
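Unfolding the recurrence gives T(n) = T(n/2) + 1 = T(n/4) + 2 = ... = T(1) + log₂ n, so T(n) ∈ Θ(log₂ n). A small sketch (the helper name is mine) that counts the halving steps and compares them with log₂ n:

```python
import math

def halvingSteps(n):
    # Number of times n can be halved before reaching 1,
    # i.e. the number of recursive calls in T(n) = T(n/2) + 1
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

for n in (8, 1024, 10 ** 6):
    print(n, halvingSteps(n), math.floor(math.log2(n)))
```

For every n the counted steps equal floor(log₂ n), matching the Θ(log₂ n) bound in the table below.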

Iterative binary-search function

def binarySearch(key, data):
    """Search for an element in an ordered list
    key - element to search
    data - the list
    Return insertion position of key that keeps list ordered
    """
    if len(data) == 0 or key < data[0]:
        return 0
    if key > data[-1]:
        return len(data)
    left = 0
    right = len(data)
    while right - left > 1:
        middle = (left + right) // 2
        if key <= data[middle]:
            right = middle
        else:
            left = middle
    return right

Search runtime complexity

Algorithm       Best case   Average      Worst case   Overall
Sequential      Θ(n)        Θ(n)         Θ(n)         Θ(n)
Successor       Θ(1)        Θ(n)         Θ(n)         O(n)
Binary search   Θ(1)        Θ(log₂ n)    Θ(log₂ n)    O(log₂ n)

Demo: Collections and search. Examine the source code in ex34_search.py

Sorting: rearrange a data collection in such a way that the elements of the collection satisfy a given order.
Internal sort - data to be sorted are available in the internal memory.
External sort - data are available as a file (on external media).
In-place sort - transforms the input into the output using only a small amount of additional space. Its opposite is called out-of-place.
Stability - we say that a sort is stable when the original order of multiple records having the same key is preserved.
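As a quick illustration of stability (the record data below is made up), Python's built-in sorted is stable, so records with equal keys keep their original relative order:

```python
# Hypothetical records: (grade, name); we sort by grade only
records = [(9, "ana"), (7, "bob"), (9, "cora"), (7, "dan")]

# sorted() in Python is stable: equal keys keep their input order
by_grade = sorted(records, key=lambda record: record[0])
print(by_grade)
# "ana" still precedes "cora", and "bob" still precedes "dan"
```

An unstable sort would be free to swap "ana" and "cora", since their grades compare equal.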

Demo: Stable sort example. Examine the source code in ex35_stablesort.py

Elements of the data collection are called records. A record is formed by one or more components, called fields. A key K is associated to each record, and is usually one of the fields. We say that a collection of n records is:
Sorted in increasing order by the key K: if K(i) ≤ K(j) for 0 ≤ i < j < n
Sorted in decreasing order: if K(i) ≥ K(j) for 0 ≤ i < j < n
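The increasing-order definition translates directly into a predicate; a minimal sketch (the function name is mine), using the fact that K(i) ≤ K(j) for all pairs i < j is equivalent to checking only consecutive pairs:

```python
def isSortedIncreasing(keys):
    # K(i) <= K(j) for all 0 <= i < j < n holds iff
    # every consecutive pair of keys is in order
    return all(keys[i] <= keys[i + 1] for i in range(len(keys) - 1))

print(isSortedIncreasing([1, 2, 2, 5]))  # True
print(isSortedIncreasing([1, 3, 2]))     # False
```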

Internal sorting

Problem specification
Data: n, K, where K = (k_1, k_2, ..., k_n), k_i ∈ R, i = 1, .., n.
Results: K', where K' is a permutation of K having sorted elements: k'_1 ≤ k'_2 ≤ ... ≤ k'_n.

A few that we will study: Selection sort, Insertion sort, Bubble sort, Quick sort.

Selection Sort: determine the element having the minimal key, and swap it with the first element. Repeat the procedure for the remaining elements, until all elements have been considered.

Selection sort - algorithm

def selectionSort(data):
    for i in range(len(data)):
        min_index = i
        # Find smallest element in the rest of the list
        for j in range(i + 1, len(data)):
            if data[j] < data[min_index]:
                min_index = j
        data[i], data[min_index] = data[min_index], data[i]

Selection sort - time complexity

The total number of comparisons is
Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 1 = n(n-1)/2 ∈ Θ(n²)
Independently of the input data ordering, what are the best, average and worst-case computational complexities?
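The count above does not depend on the input order, which answers the question: best, average and worst case are all Θ(n²). A sketch (the instrumentation is my addition) that tallies comparisons on sorted and reverse-sorted input:

```python
def selectionSortCounting(data):
    # Selection sort, returning the number of key comparisons performed
    comparisons = 0
    for i in range(len(data)):
        min_index = i
        for j in range(i + 1, len(data)):
            comparisons += 1
            if data[j] < data[min_index]:
                min_index = j
        data[i], data[min_index] = data[min_index], data[i]
    return comparisons

n = 10
print(selectionSortCounting(list(range(n))))         # sorted input: 45
print(selectionSortCounting(list(range(n, 0, -1))))  # reverse input: 45
# both equal n*(n-1)//2, regardless of the initial order
```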

Selection sort - space complexity

In-place: algorithms that use a small (constant) amount of additional memory.
Out-of-place (not in-place): algorithms that use a non-constant amount of extra space.
The additional memory required by selection sort is O(1), so selection sort is an in-place sorting algorithm.

Direct selection sort

def directSelectionSort(data):
    for i in range(0, len(data) - 1):
        # Select the smallest element
        for j in range(i + 1, len(data)):
            if data[j] < data[i]:
                data[i], data[j] = data[j], data[i]

Direct selection sort

Overall time complexity:
Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 1 = n(n-1)/2 ∈ Θ(n²)

Insertion Sort: traverse the elements, inserting the current element at the right position in the subsequence of already sorted elements. The subsequence containing the already processed elements is kept sorted, so that, at the end of the traversal, the whole sequence is sorted.

Insertion Sort - Algorithm

def insertSort(data):
    for i in range(1, len(data)):
        index = i - 1
        elem = data[i]
        # Insert into correct position
        while index >= 0 and elem < data[index]:
            data[index + 1] = data[index]
            index -= 1
        data[index + 1] = elem

Insertion Sort - time complexity

The maximum number of iterations (worst case) happens if the initial array is sorted in descending order:
T(n) = Σ_{i=2}^{n} (i-1) = n(n-1)/2 ∈ Θ(n²)

Insertion Sort - time complexity

The minimum number of iterations (best case) happens if the initial array is already sorted:
T(n) = Σ_{i=2}^{n} 1 = n-1 ∈ Θ(n)

Insertion Sort - complexity

Time complexity - the overall time complexity of insertion sort is O(n²).
Space complexity - the additional space used by insertion sort is Θ(1), so insertion sort is an in-place sorting algorithm.
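The best/worst gap is easy to observe directly; a sketch (the shift counter is my addition) counting element shifts on already-sorted versus reverse-sorted input:

```python
def insertSortCounting(data):
    # Insertion sort, returning the number of element shifts performed
    shifts = 0
    for i in range(1, len(data)):
        index = i - 1
        elem = data[i]
        while index >= 0 and elem < data[index]:
            data[index + 1] = data[index]
            index -= 1
            shifts += 1
        data[index + 1] = elem
    return shifts

n = 10
print(insertSortCounting(list(range(n))))         # best case: 0 shifts
print(insertSortCounting(list(range(n, 0, -1))))  # worst case: n*(n-1)//2 = 45
```

Unlike selection sort, the amount of work insertion sort does depends heavily on how close the input already is to being sorted.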

Bubble sort: compares pairs of consecutive elements, which are swapped if not in the expected order. The process ends when all pairs of consecutive elements are in the expected order.

Bubble sort - algorithm

def bubbleSort(data):
    done = False
    while not done:
        done = True
        for i in range(0, len(data) - 1):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                done = False

Bubble sort - complexity

Best-case running time complexity is Θ(n).
Worst-case running time complexity is Θ(n²).
Average running-time complexity is Θ(n²).
Space complexity (additional memory required) is Θ(1).
Bubble sort is an in-place sorting algorithm.

Quick sort is based on the divide and conquer technique:
1. Divide: partition the array into 2 sub-arrays such that elements in the lower part ≤ elements in the higher part.
   Partitioning: re-arrange the elements so that the element called the pivot occupies its final position in the sub-sequence. If i is that position: k_j ≤ k_i ≤ k_l, for Left ≤ j < i < l ≤ Right.
2. Conquer: recursively sort the 2 sub-arrays.
3. Combine: trivial, since sorting is done in place.

Quick sort - partitioning algorithm

def partition(data, left, right):
    pivot = data[left]
    i = left
    j = right
    while i != j:
        # Find an element smaller than the pivot
        while data[j] >= pivot and i < j:
            j -= 1
        data[i] = data[j]
        # Find an element larger than the pivot
        while data[i] <= pivot and i < j:
            i += 1
        data[j] = data[i]
    # Place the pivot in position
    data[i] = pivot
    return i

Quick sort - algorithm

def quickSort(data, left, right):
    # Partition the list
    pos = partition(data, left, right)
    # Order left side
    if left < pos - 1:
        quickSort(data, left, pos - 1)
    # Order right side
    if pos + 1 < right:
        quickSort(data, pos + 1, right)

Quick sort - time complexity

The run time of quick sort depends on the distribution of splits. The partitioning function requires linear time. In the best case, the partitioning function splits the array evenly:
T(n) = 2T(n/2) + Θ(n), so T(n) ∈ Θ(n log₂ n)

Quick sort - best partitioning

We partition n elements log₂ n times, so T(n) ∈ Θ(n log₂ n).

Quick sort - worst partitioning

In the worst case, the partition function splits the array such that one side of the partition has only one element:
T(n) = T(1) + T(n-1) + Θ(n) = T(n-1) + Θ(n) = Σ_{k=1}^{n} Θ(k) ∈ Θ(n²)

Quick sort - worst case

Worst-case partitioning appears when the input array is sorted or reverse sorted: n elements are partitioned n times, so T(n) ∈ Θ(n²).

Sorting runtime complexity

Algorithm        Worst case   Average
Selection sort   Θ(n²)        Θ(n²)
Insertion sort   Θ(n²)        Θ(n²)
Bubble sort      Θ(n²)        Θ(n²)
Quick sort       Θ(n²)        Θ(n log₂ n)

Demo: Examine the source code in ex36_sort.py

Lambda expressions

Small anonymous functions that you define and use in the same place. Syntactically restricted to a single expression. Can reference variables from the containing scope (just like nested functions). They are syntactic sugar for a function definition.
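Tying lambdas back to sorting, a few short sketches (the list contents are made up for illustration):

```python
# A lambda is just an expression that evaluates to a function object
square = lambda x: x * x
print(square(5))  # 25

# Their typical use: short, throwaway key functions for sorting
pairs = [("ana", 9), ("bob", 7), ("cora", 8)]
print(sorted(pairs, key=lambda p: p[1]))  # sort by the second field

# A lambda can capture a variable from the containing scope
threshold = 8
at_least = list(filter(lambda p: p[1] >= threshold, pairs))
print(at_least)  # [("ana", 9), ("cora", 8)]
```

Each lambda here could equally be written as a named def; the lambda form just keeps the one-line function next to its single use.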

Demo: Examine the source code in ex36_lambdas.py