Divide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14


CSE 101: Design and analysis of algorithms Divide and conquer algorithms Reading: Sections 2.3 and 2.4 Homework 6 will be assigned today Due Nov 20, 11:59 PM CSE 101, Fall 2018 2

Warning If you have an algorithm that has a recursion such as T(n) = aT(n − b) + O(n^d), this is sometimes referred to as reduce and conquer. Be careful of this type of recursion, because if a is greater than 1, then the algorithm may take exponential time. Based on slides courtesy of Miles Jones CSE 101, Fall 2018 3
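
For instance, here is a quick illustration (the specific values a = 2, b = 1, and constant-time non-recursive work are chosen for the example, not taken from the slides): unrolling T(n) = 2T(n − 1) + c gives
T(n) = 2T(n − 1) + c = 4T(n − 2) + 3c = 8T(n − 3) + 7c = ... = 2^(n−1) T(1) + (2^(n−1) − 1)c,
which is Θ(2^n), even though each level of the recursion only does constant work.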

Search Given a sorted list of integers and a target integer, determine the index of the target if it is in the list. What is the fastest runtime that we could hope for? We have two options for each number we compare to the target (binary tree). CSE 101, Fall 2018 5

Search Sorted list of n integers: 4, 29, 31, 52, 79, 82, 83, 99, 100, 111, 142, 153, 160, 169, 181. Arranged as a binary tree of n nodes, level by level: the root is 99; its children are 52 and 153; the next level holds 29, 82, 111, and 169; and the leaves are 4, 31, 79, 83, 100, 142, 160, and 181. CSE 101, Fall 2018 12

Search What is the shortest height possible for a binary tree of n nodes? The shortest possible height is about log n, so any comparison-based search algorithm takes Ω(log n) time. CSE 101, Fall 2018 15

Divide and conquer search Starting with a sorted list of n integers and a target, the goal is to output the index of the target if it is in the list. Break a problem into similar subproblems: split the list into two sublists, each of half the size. Solve each subproblem recursively: recursively search one of the lists. Combine (non-recursive part): compare and add index, if necessary. Time complexity? Each step other than the recursive call takes O(1) time, so T(n) = T(n/2) + O(1). CSE 101, Fall 2018 19
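
Below is a minimal Python sketch of this divide-and-conquer search (binary search). It assumes a 0-indexed sorted list and returns None when the target is absent; the function name and conventions are illustrative, not from the slides.

def binary_search(a, target, lo=0, hi=None):
    # Return an index i with a[i] == target, or None if the target is not in the sorted list a.
    if hi is None:
        hi = len(a) - 1
    if lo > hi:                  # empty range: the target is not present
        return None
    mid = (lo + hi) // 2
    if a[mid] == target:
        return mid
    if a[mid] < target:          # the target can only be in the right half
        return binary_search(a, target, mid + 1, hi)
    return binary_search(a, target, lo, mid - 1)   # otherwise, only in the left half

For example, binary_search([4, 29, 31, 52, 79, 82, 83, 99, 100, 111, 142, 153, 160, 169, 181], 111) returns 9. Each call discards half of the remaining range, matching T(n) = T(n/2) + O(1).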

Degenerate divide and conquer The important aspect of divide and conquer is that the input size decreases in the recursive call, hopefully by a constant factor, not just by one or two. Sometimes this can even happen when there is only one recursive call. Let's call this the degenerate case, because we are not really dividing the input into parts. CSE 101, Fall 2018 20

Master theorem applied to binary search The recursion for the runtime is T(n) = T(n/2) + O(1). For T(n) = aT(n/b) + O(n^d), the master theorem gives T(n) ∈ O(n^d) if a < b^d, O(n^d log n) if a = b^d, and O(n^(log_b a)) if a > b^d. So, we have a = 1, b = 2, and d = 0. In this case a = b^d, so T(n) ∈ O(n^0 log n) = O(log n). CSE 101, Fall 2018 21

Sorting Average time complexity
Bubble sort: O(n^2)
Insertion sort: O(n^2)
Selection sort: O(n^2)
Quicksort: O(n log n)
Merge sort: O(n log n)
CSE 101, Fall 2018 33

Sorting How long should it take to sort things? If we must sort based on comparisons, then any sorting algorithm corresponds to traveling down a path in a decision tree (the slide draws the tree for n = 3). How many leaves does the tree have? What is its height? The tree has n! leaves (6 for n = 3), one for each possible ordering, so its height is at least log n! (about 2.59 for n = 3). Any sorting algorithm that relies on comparisons between elements runs in Ω(log n!) time. CSE 101, Fall 2018 40

Sorting lower bound runtime Any sorting algorithm that relies on comparisons between elements runs in Ω(log n!) time. Is there a simpler expression than log n!? log n! = log n + log(n − 1) + ... + log 2 + log 1 < n log n, and log n! = Σ_{k=1}^{n} log k = Θ(n log n). CSE 101, Fall 2018 44

Sorting lower bound runtime What is the fastest possible worst case for any sorting algorithm? log n!. How big is that? log n! = Σ_{k=1}^{n} log k ≥ ∫_1^n log x dx = [x log x − x]_1^n = (n log n − n) − (0 − 1) = n log n − n + 1, so log n! = Ω(n log n). CSE 101, Fall 2018 49
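
Another quick way to see the same lower bound (a standard shortcut, not from the slides): keep only the largest half of the terms in the sum. Since each of the last n/2 terms is at least log(n/2),
log n! = Σ_{k=1}^{n} log k ≥ Σ_{k=n/2}^{n} log k ≥ (n/2) log(n/2) = Ω(n log n).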

Sorting lower bound runtime Any sorting algorithm that relies on comparisons between elements runs in Ω(log n! ) time There are sorting algorithms that run faster, but they rely on some prior knowledge about the elements, e.g., if you know the values are only allowed to come from a small range of values CSE 101, Fall 2018 50

Sorting 1, 2, 3 elements Sorting 1 element takes log 1! = 0 comparisons. Sorting 2 elements takes log 2! = 1 comparison. Sorting 3 elements should take ⌈log 3!⌉ = 3 comparisons. You could compare every pair of elements, (3 choose 2) = 3 comparisons, or follow the decision tree. CSE 101, Fall 2018 51

Sorting 4 elements Sorting 4 elements should take ⌈log 4!⌉ = 5 comparisons. You could compare every pair of elements, (4 choose 2) = 6 comparisons. Exercise: sort 4 elements using only 5 comparisons (one possible approach is sketched below). CSE 101, Fall 2018 52
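
The exercise is left open on the slide, but for reference here is one possible Python sketch (sort the two pairs, compare the pair maxima, then binary-insert the leftover element); the counter is only there to check that at most 5 comparisons are used.

def sort4(w, x, y, z):
    # Sort four values using at most 5 comparisons; returns (sorted list, number of comparisons).
    count = 0
    def less(p, q):
        nonlocal count
        count += 1
        return p < q
    a, b, c, d = w, x, y, z
    if less(b, a):                 # comparison 1: order the first pair so a <= b
        a, b = b, a
    if less(d, c):                 # comparison 2: order the second pair so c <= d
        c, d = d, c
    if less(d, b):                 # comparison 3: swap the pairs if needed so that b <= d
        a, b, c, d = c, d, a, b
    # Now a <= b <= d is a sorted chain and c <= d; insert c into the chain.
    if less(c, b):                 # comparison 4
        if less(c, a):             # comparison 5
            return [c, a, b, d], count
        return [a, c, b, d], count
    return [a, b, c, d], count     # c falls between b and d

For example, sort4(3, 1, 4, 2) returns ([1, 2, 3, 4], 5).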

Divide and conquer sort Starting with a list of integers, the goal is to output the list in sorted order. Break a problem into similar subproblems: split the list into two sublists, each of half the size. Solve each subproblem recursively: recursively sort the two sublists. Combine: put the two sorted sublists together to create a sorted list of all the elements. CSE 101, Fall 2018 54

mergesort algorithm
function mergesort(a[1 ... n])
  if n > 1:
    ML = mergesort(a[1 ... n/2])
    MR = mergesort(a[n/2 + 1 ... n])
    return merge(ML, MR)
  else:
    return a
CSE 101, Fall 2018 55
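
A runnable Python version of the same idea (a sketch; the slide's pseudocode is 1-indexed, while this version uses Python slices and returns a new list):

def merge(left, right):
    # Combine two sorted lists into one sorted list in O(len(left) + len(right)) time.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])    # at most one of these two extends is non-empty
    out.extend(right[j:])
    return out

def mergesort(a):
    # Sort a list by splitting it in half, sorting each half recursively, and merging.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))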

mergesort algorithm, runtime Suppose mergesort runs in T(n) time for inputs of length n. Then each recursive call runs in T(n/2) time, and merge runs in O(k + l) time where k, l ≈ n/2, so merge runs in O(n) time. Therefore T(n) = 2T(n/2) + O(n), and T(n) = O(n log n) by the master theorem. CSE 101, Fall 2018 59

mergesort algorithm, correctness Correctness of divide and conquer algorithms is generally very easy; we use strong induction. Base case: n = 1, mergesort returns the original array a, which is trivially sorted. Inductive hypothesis: suppose that for some n ≥ 1, mergesort(a[1 ... k]) outputs the elements of a in sorted order on all inputs of size k where 1 ≤ k ≤ n. We want to show that it works for inputs of size n + 1. CSE 101, Fall 2018 62

mergesort algorithm, correctness Inductive step: since n ≥ 1, mergesort(a[1, ..., n + 1]) returns merge(ML, MR), where ML = mergesort(a[1 ... (n+1)/2]) and MR = mergesort(a[(n+1)/2 + 1, ..., n + 1]). Since (n+1)/2 ≤ n, the inductive hypothesis ensures that ML and MR are sorted. And merge combines two sorted lists, so the algorithm returns the elements in sorted order. CSE 101, Fall 2018 65

Median The median of a list of numbers is the middle number in the sorted list. If the list has n values and n is odd, then the middle element is clear: it is the ⌈n/2⌉th smallest element. Example: med(8, 2, 9, 11, 4) = 8 because n = 5 and 8 is the 3rd = ⌈5/2⌉th smallest element of the list. CSE 101, Fall 2018 66

Median The median of a list of numbers is the middle number in the list. If the list has n values and n is even, then there are two middle elements; let's say that the median is the smaller of the two. Then the median is again the ⌈n/2⌉th smallest element. Example: med(10, 23, 7, 26, 17, 3) = 10 because n = 6 and 10 is the 3rd = ⌈6/2⌉th smallest element of the list. CSE 101, Fall 2018 67

Median The purpose of the median is to summarize a set of numbers. The average is also a commonly used value, but the median is more typical of the data. For example, suppose in a company with 20 employees, the CEO makes $1 million and all the other workers each make $50,000. Then the average is $97,500 and the median is $50,000, which is much closer to the typical worker's salary. CSE 101, Fall 2018 68
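
As a quick sanity check of these numbers (an illustrative snippet, not part of the lecture):

salaries = [1_000_000] + [50_000] * 19              # the CEO plus 19 workers, 20 employees total
average = sum(salaries) / len(salaries)              # 97500.0
median = sorted(salaries)[len(salaries) // 2 - 1]    # the 10th smallest salary: 50000
print(average, median)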

Median, algorithm Can you think of an efficient way to find the median? How long would it take? Is there a lower bound on the runtime of all median selection algorithms? Sort the list, then find the ⌈n/2⌉th element: O(n log n). You can never have a runtime faster than O(n), because you at least have to look at every element; all selection algorithms are Ω(n). CSE 101, Fall 2018 71

Selection What if we designed an algorithm that takes as input a list of numbers of length n and an integer k with 1 ≤ k ≤ n, and outputs the kth smallest integer in the list? Then we could just plug in ⌈n/2⌉ for k and we could find the median! CSE 101, Fall 2018 72

Divide and conquer selection Let's think about selection in a divide and conquer type of way. Break a problem into similar subproblems: split the list into two sublists. Solve each subproblem recursively: recursively select from one of the sublists. Combine: determine how to split the list again. CSE 101, Fall 2018 74

Divide and conquer selection How would you split the list? Just splitting the list down the middle does not help so much. What we will do is pick a random pivot and split the list into all integers greater than the pivot and all that are less than the pivot. Then, we can determine which list to look in to find the kth smallest element. Note the value of k may change depending on which list we are looking in. CSE 101, Fall 2018 75

Divide and conquer selection Example Selection([40,31,6,51,76,58,97,37,86,31,19,30,68],7) Pick a random pivot, say 31. Then, divide the list into three groups SL, Sv, SR such that SL contains all elements smaller than 31, Sv is all elements equal to 31, and SR is all elements greater than 31. SL=[6,19,30], size = 3 Sv=[31,31], size = 2 SR=[40,51,76,58,97,37,86,68], size = 8 CSE 101, Fall 2018 76

Divide and conquer selection Example Selection([40,31,6,51,76,58,97,37,86,31,19,30,68],7) SL=[6,19,30], size = 3; Sv=[31,31], size = 2; SR=[40,51,76,58,97,37,86,68], size = 8. Since k=7 is bigger than the size of SL, we know the kth smallest element cannot be in SL. Since k=7 is also bigger than the size of SL plus the size of Sv, it cannot be in Sv either. Therefore it must be in SR. So the 7th smallest element in the original list is what number in SR? It is the (7 − 3 − 2) = 2nd smallest. CSE 101, Fall 2018 80

Divide and conquer selection Example Selection([40,31,6,51,76,58,97,37,86,31,19,30,68],7) SL=[6,19,30], size = 3; Sv=[31,31], size = 2; SR=[40,51,76,58,97,37,86,68], size = 8. So, the 7th smallest element in the original list is the 2nd smallest in SR: Selection([40,31,6,51,76,58,97,37,86,31,19,30,68],7) = Selection([40,51,76,58,97,37,86,68],2) = 40. CSE 101, Fall 2018 81

Selection algorithm
Input: a list of n integers and an integer k, where 1 ≤ k ≤ n
Output: the kth smallest number in the list
function Selection(a[1 ... n], k)
  if n == 1: return a[1]
  pick a random element v in the list
  split the list into sets SL, Sv, SR (elements less than, equal to, and greater than v)
  if k ≤ |SL|: return Selection(SL, k)
  else if k ≤ |SL| + |Sv|: return v
  else: return Selection(SR, k − |SL| − |Sv|)
CSE 101, Fall 2018 82
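
A Python sketch of this selection routine (quickselect with a random pivot); as on the slide, k = 1 asks for the smallest element:

import random

def selection(a, k):
    # Return the kth smallest element of a (1 <= k <= len(a)); expected O(n) time.
    if len(a) == 1:
        return a[0]
    v = random.choice(a)                 # random pivot
    SL = [x for x in a if x < v]         # elements smaller than the pivot
    Sv = [x for x in a if x == v]        # elements equal to the pivot
    SR = [x for x in a if x > v]         # elements larger than the pivot
    if k <= len(SL):
        return selection(SL, k)
    if k <= len(SL) + len(Sv):
        return v
    return selection(SR, k - len(SL) - len(Sv))

For example, selection([40, 31, 6, 51, 76, 58, 97, 37, 86, 31, 19, 30, 68], 7) returns 40, matching the worked example above.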

Selection algorithm, runtime In the Selection pseudocode above, the base case, picking a random pivot, and the comparisons of k against |SL| and |SL| + |Sv| each take O(1) time; splitting the list into SL, Sv, SR takes O(n) time; and the recursive call takes T(|SL|) or T(|SR|) time. So the runtime is dependent on |SL| and |SR|. CSE 101, Fall 2018 84

Selection algorithm, runtime The runtime is dependent on |SL| and |SR|. If we were so lucky as to choose v to be close to the median every time, then |SL| ≈ |SR| ≈ n/2. And so, no matter which set we recurse on, T(n) = T(n/2) + O(n), which gives T(n) = O(n) by the master theorem. CSE 101, Fall 2018 86

Selection algorithm, runtime The runtime is dependent on |SL| and |SR|. Conversely, if we were so unlucky as to choose v to be the maximum (or minimum) every time, then |SL| (or |SR|) = n − 1 and T(n) = T(n − 1) + O(n), which gives T(n) = O(n^2). That is worse than sorting and then finding the element. Although there is a chance of having a high runtime, is it still worth it? CSE 101, Fall 2018 89

Expected runtime If you randomly select the ith element as the pivot, then your list will be split into a list of length i and a list of length n − i. So whichever list we recurse on, the recursive call takes time at most proportional to max(i, n − i). (The slide plots the sizes i and n − i, labeled SL and SR, against the pivot position i.) CSE 101, Fall 2018 91

Expected runtime Clearly, the split with the smallest maximum size is when i = n/2, and the worst case is i = n or i = 1. CSE 101, Fall 2018 92

Expected runtime What is the expected runtime? Well, what is our random variable? For each input and sequence of random choices of pivots, the random variable is the runtime of that particular outcome. CSE 101, Fall 2018 93

Expected runtime So, if we want to find the expected runtime, then we must sum over all possible choices. Let ET(n) be the expected runtime. Then ET(n) = (1/n) Σ_{i=1}^{n} ET(max(i, n − i)) + O(n). CSE 101, Fall 2018 94

Expected runtime What is the probability of choosing a value from 1 to n in the interval [n/4, 3n/4], if all values are equally likely? It is 1/2. CSE 101, Fall 2018 95

Expected runtime If you did choose a pivot value between n/4 and 3n/4, then the sizes of the subproblems would both be at most 3n/4; otherwise, the subproblems could be as large as n. So, we can compute an upper bound on the expected runtime: ET(n) ≤ (1/2) ET(3n/4) + (1/2) ET(n) + O(n). CSE 101, Fall 2018 97

Expected runtime ET(n) ≤ (1/2) ET(3n/4) + (1/2) ET(n) + O(n), which rearranges to ET(n) ≤ ET(3n/4) + O(n). Plug into the master theorem with a = 1, b = 4/3, d = 1: a < b^d, so ET(n) ∈ O(n). CSE 101, Fall 2018 98
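
Unrolling that bound also shows directly why the answer is linear (a sketch, with c standing for the constant hidden in the O(n) term):
ET(n) ≤ cn + ET(3n/4) ≤ cn + (3/4)cn + ET((3/4)^2 n) ≤ ... ≤ cn (1 + 3/4 + (3/4)^2 + ...) ≤ 4cn = O(n).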

Quicksort What have we noticed about the partitioning part of Selection? After partitioning, the pivot is in its correct position in sorted order. Quicksort takes advantage of that. CSE 101, Fall 2018 99

Divide and conquer quicksort Let's think about quicksort in a divide and conquer type of way. Break a problem into similar subproblems: split the list into two sublists by partitioning around a pivot. Solve each subproblem recursively: recursively sort each sublist. Combine: concatenate the lists. CSE 101, Fall 2018 100

Divide and conquer quicksort
procedure quicksort(a[1 ... n])
  if n ≤ 1: return a
  set v to be a random element in a
  partition a into SL, Sv, SR
  return quicksort(SL) + Sv + quicksort(SR)    (concatenation)
CSE 101, Fall 2018 101
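
A matching Python sketch (random pivot, three-way partition, then concatenate):

import random

def quicksort(a):
    # Return a sorted copy of a: partition around a random pivot, sort both sides, concatenate.
    if len(a) <= 1:
        return a
    v = random.choice(a)
    SL = [x for x in a if x < v]     # strictly smaller than the pivot
    Sv = [x for x in a if x == v]    # equal to the pivot (already in final position)
    SR = [x for x in a if x > v]     # strictly larger than the pivot
    return quicksort(SL) + Sv + quicksort(SR)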

Divide and conquer quicksort, runtime In the pseudocode above, partitioning a into SL, Sv, SR takes O(n) time, and the recursive calls take T(|SL|) and T(|SR|) time, so T(n) = T(|SL|) + T(|SR|) + O(n). CSE 101, Fall 2018 102

Divide and conquer quicksort, runtime Taking the expectation over the random pivot, which splits the list into SL of size i and SR of size n − i, ET(n) = (1/n) Σ_{i=1}^{n} [ET(i) + ET(n − i)] + O(n) = (2/n) Σ_{i=1}^{n} ET(i) + O(n), and ET(n) = O(n log n). Exercise. CSE 101, Fall 2018 103

Selection (deterministic) The algorithm we have described is sometimes called quickselect; it is a very practical linear expected time algorithm and is used in practice. For theoretical computer scientists, though, it is unsatisfactory to only have a randomized algorithm that could run in quadratic time. Blum, Floyd, Pratt, Rivest, and Tarjan developed a deterministic approach to finding the median (or any kth smallest element). They use a divide and conquer strategy to find a number close to the median and then use that number as the pivot. CSE 101, Fall 2018 104

Selection (deterministic) The strategy is to split the list into sets of 5 and find the medians of all those sets. Then, find the median of those medians using a recursive call (costing T(n/5)). Finally, partition the list around that median of medians and recurse on SL or SR, just like in quickselect. CSE 101, Fall 2018 105

Median of medians
MofM(L, k)
  if L has 10 or fewer elements: sort L and return the kth smallest element
  partition L into sublists S[i] of five elements each
  for i = 1, ..., n/5: m_i = MofM(S[i], 3)
  v = MofM([m_1, ..., m_{n/5}], n/10)
  split the list into sets SL, Sv, SR around v
  if k ≤ |SL|: return MofM(SL, k)
  else if k ≤ |SL| + |Sv|: return v
  else: return MofM(SR, k − |SL| − |Sv|)
CSE 101, Fall 2018 106

Median of medians, runtime In the pseudocode above, the base case takes O(1) time; computing the n/5 group medians takes O(n) time (n/5 iterations of constant work each); the recursive call on the list of medians takes T(n/5) time; splitting takes O(n) time; and the final recursive call takes T(|SL|) or T(|SR|) time, where |SL| ≤ 7n/10 and |SR| ≤ 7n/10, so it takes at most T(7n/10) time. CSE 101, Fall 2018 108

Selection (deterministic) By construction, it can be shown that |SR| < 7n/10 and |SL| < 7n/10, and so no matter which set we recurse on, we have T(n) = T(n/5) + T(7n/10) + O(n). The two subproblem sizes sum to n/5 + 7n/10 = 9n/10 < n (informally, T(n) ≈ T(9n/10) + O(n)), so the work shrinks geometrically. You cannot use the master theorem to solve this, but you can use induction to show that if T(k) ≤ ck for all k < n (for a large enough constant c), then T(n) ≤ cn, so T(n) = O(n). And so we have a linear time selection algorithm! CSE 101, Fall 2018 109
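
A Python sketch of the deterministic selection algorithm (median of medians with groups of five); the base-case size and the handling of a leftover group of fewer than five elements are implementation choices here, not prescribed by the slides:

def mofm(L, k):
    # Return the kth smallest element of L (1 <= k <= len(L)) in worst-case O(n) time.
    if len(L) <= 10:                                        # small base case: just sort
        return sorted(L)[k - 1]
    groups = [L[i:i + 5] for i in range(0, len(L), 5)]      # sublists of (at most) five elements
    medians = [sorted(g)[len(g) // 2] for g in groups]      # the median of each group
    v = mofm(medians, (len(medians) + 1) // 2)              # the median of the medians
    SL = [x for x in L if x < v]
    Sv = [x for x in L if x == v]
    SR = [x for x in L if x > v]
    if k <= len(SL):
        return mofm(SL, k)
    if k <= len(SL) + len(Sv):
        return v
    return mofm(SR, k - len(SL) - len(Sv))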

Next lecture Divide and conquer algorithms Reading: Chapter 2 CSE 101, Fall 2018 110