Minilecture 2: Sorting Algorithms


Math/CCS 10                                    Professor: Padraic Bartlett
Minilecture 2: Sorting Algorithms
Week 1                                         UCSB 201

On Monday's homework, we looked at an algorithm designed to sort a list! In this class, we will look at several such algorithms and study their runtimes.

1 Bubblesort

The sorting algorithm from the HW, bubblesort, is a relatively classical and beautiful sorting algorithm:

Algorithm. Take as input some list L = (l_1, ..., l_n) of objects, along with some operation that can compare any two of these elements. Perform the following algorithm:

1. Create an integer variable loc and a boolean variable didswap.

2. Set the variable loc to 1, and didswap to false.

3. While the variable loc is < n, perform the following steps:

   (a) If l_loc > l_{loc+1}, swap the two elements l_loc, l_{loc+1} in our list and set didswap to true.

   (b) Regardless of whether you swapped elements or not, add 1 to the variable loc, and go to 3.

4. If didswap is true, then go to 2.

5. Otherwise, if didswap is false, we went through our entire list and never found any pair of consecutive elements such that the first was larger than the second. Therefore, our list is sorted! Output our list.
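Before looking at a sample run, here is a minimal Python sketch of this procedure (an illustration of the steps above, not code from the lecture; it uses 0-indexed positions and a loop in place of the "go to" steps):

    def bubblesort(L):
        """Bubblesort, following the lecture's loc/didswap description."""
        n = len(L)
        didswap = True
        while didswap:                    # step 2: start a fresh pass
            didswap = False
            for loc in range(n - 1):      # step 3: walk loc across the list
                if L[loc] > L[loc + 1]:   # step 3(a): out-of-order pair
                    L[loc], L[loc + 1] = L[loc + 1], L[loc]
                    didswap = True
        return L                          # step 5: no swaps, so L is sorted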

A sample run of this algorithm on the list (1, 4, 3, 2, 5):

    run | step | didswap | loc | L
    ----+------+---------+-----+-----------------
     1  | 2    | false   |  1  | (1, 4, 3, 2, 5)
     1  | 3(a) | false   |  1  | (1, 4, 3, 2, 5)
     1  | 3(b) | false   |  2  | (1, 4, 3, 2, 5)
     1  | 3(a) | true    |  2  | (1, 3, 4, 2, 5)
     1  | 3(b) | true    |  3  | (1, 3, 4, 2, 5)
     1  | 3(a) | true    |  3  | (1, 3, 2, 4, 5)
     1  | 3(b) | true    |  4  | (1, 3, 2, 4, 5)
     1  | 3(a) | true    |  4  | (1, 3, 2, 4, 5)
     1  | 3(b) | true    |  5  | (1, 3, 2, 4, 5)
     1  | 4    | true    |  5  | (1, 3, 2, 4, 5)
     2  | 2    | false   |  1  | (1, 3, 2, 4, 5)
     2  | 3(a) | false   |  1  | (1, 3, 2, 4, 5)
     2  | 3(b) | false   |  2  | (1, 3, 2, 4, 5)
     2  | 3(a) | true    |  2  | (1, 2, 3, 4, 5)
     2  | 3(b) | true    |  3  | (1, 2, 3, 4, 5)
     2  | 3(a) | true    |  3  | (1, 2, 3, 4, 5)
     2  | 3(b) | true    |  4  | (1, 2, 3, 4, 5)
     2  | 3(a) | true    |  4  | (1, 2, 3, 4, 5)
     2  | 3(b) | true    |  5  | (1, 2, 3, 4, 5)
     2  | 4    | true    |  5  | (1, 2, 3, 4, 5)
     3  | 2    | false   |  1  | (1, 2, 3, 4, 5)
     3  | 3(a) | false   |  1  | (1, 2, 3, 4, 5)
     3  | 3(b) | false   |  2  | (1, 2, 3, 4, 5)
     3  | 3(a) | false   |  2  | (1, 2, 3, 4, 5)
     3  | 3(b) | false   |  3  | (1, 2, 3, 4, 5)
     3  | 3(a) | false   |  3  | (1, 2, 3, 4, 5)
     3  | 3(b) | false   |  4  | (1, 2, 3, 4, 5)
     3  | 3(a) | false   |  4  | (1, 2, 3, 4, 5)
     3  | 3(b) | false   |  5  | (1, 2, 3, 4, 5)
     3  | 4    | false   |  5  | (1, 2, 3, 4, 5)
     3  | 5    | false   |  5  | (1, 2, 3, 4, 5)

On the HW, you were asked to find the complexity of this algorithm, which (spoiler) is O(n^2). Instead of studying this further, we're going to look at several other sorting algorithms! Here's a sorting algorithm that is slightly worse than bubblesort: bogosort!

Algorithm. Take as input some list L = (l_1, ..., l_n) of objects, along with some operation that can compare any two of these elements. Perform the following algorithm:

1. Check this list to see if it is already in order. (Formally: create some variable loc, initialize it at 1, and go through the list comparing elements l_loc, l_{loc+1}. If there's any pair that is out of order, stop searching and declare the list out of order; otherwise, continue until you've searched the whole list.)

2. If the list was in order, stop! You're done.

3. Otherwise, the list is out of order. Randomly select a permutation of our list, and reorder our list according to this permutation. Go to 1.
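A minimal Python sketch of bogosort (illustrative only; random.shuffle plays the role of step 3's random permutation):

    import random

    def is_sorted(L):
        """Step 1: scan adjacent pairs; any inversion means out of order."""
        return all(L[i] <= L[i + 1] for i in range(len(L) - 1))

    def bogosort(L):
        """Steps 2-3: reshuffle until the list happens to be sorted."""
        while not is_sorted(L):
            random.shuffle(L)
        return L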

This algorithm is akin to sorting a deck of cards by repeatedly (a) checking if it's in order and (b) throwing it up in the air if it's not, until it's sorted. If L's elements are all distinct, bogosort has an expected runtime of n! shuffles; there are n! many ways to permute the elements of our list, while exactly one of them is the correct sorting of L. Similarly, bogosort has a worst-case runtime of ∞; given any finite number of steps t, it's easily possible that our first t shuffles don't accidentally sort our list.

The third sorting algorithm we study here is sleepsort, which is famous mostly for being the only sorting algorithm discovered by 4chan.¹ It runs in linear time, for very stupidly defined values of "linear," and we define it here:

¹ Don't look up 4chan.

Algorithm. Take as input some list {l_1, ..., l_n} of n distinct integers. Also, your system should have some sort of timekeeping mechanism (if you're doing this analog, you'd want someone with a stopwatch).

0. For every number l_i in your list, create a process that, once we start our timekeeping mechanism, will sleep for l_i seconds, and then print itself.

1. Start our clock.

2. After t seconds, all of the numbers ≤ t have printed themselves, in order of their size. Consequently, after a number of seconds equal to the size of the largest element in your list, you've written down all of the numbers in order! Win.

Properties of sleepsort: Sleepsort's name comes from the sleep command-line program, which on input n creates a process that waits n seconds before doing anything else. The following script is an implementation of sleepsort:

    #!/bin/bash
    function f() {
        sleep "$1"
        echo "$1"
    }
    while [ -n "$1" ]
    do
        f "$1" &
        shift
    done
    wait

Assuming no issues with parallelization, this takes as long to run as the largest element in our set.
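For instance, saved as sleepsort.sh and made executable (the filename is hypothetical), a run might look like the following, printing one number per line and finishing after about five seconds:

    $ ./sleepsort.sh 5 3 1 4 2
    1
    2
    3
    4
    5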

We close with a fourth sort that (while as esoteric in some ways as bogosort) actually has some nice properties and uses: beadsort!

Algorithm. Take as input some list L = (l_1, ..., l_n) of positive integers; as well, have on hand m = l_1 + ... + l_n many beads, and max(l_1, ..., l_n) many poles on which to place these beads.

1. Label our poles 1, 2, ..., max(l_1, ..., l_n).

2. For each element l_i in our list, place one bead on each of the poles 1, 2, ..., l_i.

3. Let gravity occur.

4. Starting from height n and working your way down, read off the number of beads at each given height; write down the number read at height k as the new l_k.

This algorithm is perhaps best illustrated with a pair of pictures:

[Figures: "Setup" shows rows 1 through n of beads placed on poles 1 through max_i l_i as in step 2; "Result" shows the same poles after gravity, with each row's bead count ready to be read off.]

The beautiful thing about this algorithm is that its runtime is (time to place beads on poles) + (time for gravity) + (time to read beads). Gravity takes time roughly on the order of n, while the time to place beads on poles and read beads off of poles takes anywhere between n and m, depending on how efficient you are at it. In many cases, this can be much faster than n^2, when implemented by people or specialized hardware!
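Here is a short Python simulation of the bead mechanics (illustrative, not from the lecture): we only track how many beads sit on each pole, so "gravity" never needs to be simulated explicitly.

    def beadsort(nums):
        """Simulate beadsort on a list of positive integers."""
        if not nums:
            return []
        poles = [0] * max(nums)           # one pole per possible bead position
        for x in nums:                    # step 2: a bead on poles 1..x
            for p in range(x):
                poles[p] += 1
        # After gravity, row h holds a bead on every pole with >= h beads,
        # so reading rows from height n down to 1 gives the sorted list.
        return [sum(1 for count in poles if count >= h)
                for h in range(len(nums), 0, -1)]

    print(beadsort([3, 1, 4, 1, 5]))      # prints [1, 1, 3, 4, 5]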

We close here by mentioning a fifth kind of sorting algorithm: quicksort!

Algorithm. Take as input some list L = (l_1, ..., l_n) of positive integers.

1. Pick an element l_k out of our list. Call this a pivot element.

2. Take the rest of our list, and split it into two lists: the list L_< of elements less than our pivot, and the list L_≥ of elements that are greater than or equal to our pivot.

3. Quicksort the two lists L_< and L_≥. (Our sorted list is then the sorted L_<, followed by the pivot, followed by the sorted L_≥.)

Note that this algorithm is recursive in nature: it calls lots of copies of itself to sort a list. The difficulty with quicksort implementations is in that word "pick" at the first step. Typically, as we will discuss in a bit, quicksort takes something like O(n log(n)) steps to sort our list; however, there are occasional lists where it takes n^2-many steps to sort the list! In particular, if we have any deterministic method that's defined ahead of time for picking out elements, like (say) "always choose the first element of our list" for the pivot, it's possible that the list we start from will make us take n runs to sort our list. For example, suppose that we started with the list (1, 2, ..., n):

1. At the start, we choose the element 1 as a pivot. We then split our list into two lists, L_< and L_≥: the first list, by definition, is empty, while the second list is (2, 3, ..., n).

2. We now quicksort these two lists. The empty list is trivially sorted! To quicksort the second list, we choose its first element (2) as a pivot, and split it into two lists L_< = ∅ and L_≥ = (3, 4, ..., n).

3. ... repeat this process!

This clearly takes n passes, and therefore O(n^2) comparisons, to sort our list. Furthermore, there's no deterministic algorithm that doesn't consider our list that can avoid this problem: given any deterministic algorithm that will pick the k(L)-th element out of a list L of n elements, we can put the smallest element of our list at the location k(L), the second-smallest element at the location corresponding to where k would pick an element out of the list L \ (smallest element), and so on/so forth.²

² However, there are deterministic algorithms that can take your list, read through it in O(n) time, and create a set of good pivots for us to use! This is a second way around the problem we're describing here, and is pretty efficient, as adding O(n) runtime to an O(n log(n)) algorithm doesn't change the overall nature of its complexity. See http://www.cc.gatech.edu/~mihail/mediancmu.pdf for a discussion of this.

This is potentially problematic: if a malicious attacker knew our algorithm and wanted to make our system crash, they could intentionally feed us lots of worst-case-scenario lists. How do we deal with this?

Again, the answer to this problem is to use randomness! In particular, suppose that we choose our pivot element randomly. What is the expected number of passes we will have to make through any list? Well, one way to get a rough upper bound is the following: given a list L, call a pivot element p good if it's from the middle half of the elements of L after sorting, and call it bad otherwise. The probability that a random pivot is good is clearly 1/2. Furthermore, if we pick such a good pivot, it will cut our list of n elements into two pieces, one of size at most 3n/4 and the other of size at least n/4; otherwise, if we pick a bad pivot, it's at the least the same as not doing anything (i.e. it leaves us with pretty much the same list).
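As an aside, here is a minimal Python sketch of the randomized quicksort just described (an illustration of the three steps above, not the lecture's own code; random.choice plays the role of the random pivot):

    import random

    def quicksort(L):
        """Randomized quicksort: random pivot, split into L_< and L_>=."""
        if len(L) <= 1:                # a list of 0 or 1 elements is sorted
            return L
        pivot = random.choice(L)       # random pivot defeats adversarial lists
        rest = list(L)
        rest.remove(pivot)             # "the rest of our list"
        less = [x for x in rest if x < pivot]
        geq = [x for x in rest if x >= pivot]
        return quicksort(less) + [pivot] + quicksort(geq)

With that sketch in hand, let's return to the expected-runtime analysis.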

Therefore, if we set T(n) to be the expected number of steps that it will take to random-quicksort a list with n elements, we have just shown that

    T(n) ≤ n + (1/2)(T(3n/4) + T(n/4)) + (1/2)T(n).

This is because each choice of pivot takes n comparisons to determine how to split our sets, and we've either chosen a good pivot (in which case we've split our sets into two pieces) or a bad pivot (in which case we've not split our sets at all). Solving for T(n) gives us

    T(n) ≤ 2n + T(3n/4) + T(n/4).

Using this, we can solve for an upper bound on T(n): in particular, if we guessed that T(n) was something like an log(n), we could prove by induction that T(n) ≤ an log(n), with logs taken base 2. If a ≥ 2, this is certainly true for n = 2, as it takes us at most 2 steps to quicksort a list of two elements (pick a pivot; the other element is either less than or greater than it). So: if we inductively assume that it holds for all values less than n, we get

    T(n) ≤ 2n + a(3n/4) log(3n/4) + a(n/4) log(n/4)
         = 2n + a(3n/4)(log(3/4) + log(n)) + a(n/4)(log(1/4) + log(n))
         = 2n + an log(n) + an((3/4) log(3/4) + (1/4) log(1/4))
         = (2 + a((3/4) log(3/4) + (1/4) log(1/4)))n + an log(n)
         ≤ (2 - .81a)n + an log(n),

where the last step uses (3/4) log(3/4) + (1/4) log(1/4) ≈ -.81. So, if a ≥ 9, say, then 2 - .81a is less than 0, and therefore this entire quantity is ≤ an log(n), which is what we wanted to prove.

So this quicksort has an expected runtime that is O(n log(n)): i.e. up to a constant, it runs like n log(n)! And it's immune to someone intentionally creating bad lists, because it's randomized: no matter what list it's fed, it will expect to finish after at most 9n log(n) steps.
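As a quick numerical sanity check (not part of the lecture), we can iterate the recurrence T(n) ≤ 2n + T(3n/4) + T(n/4) directly, rounding subproblem sizes down, and compare it with the 9n log_2(n) bound just proved:

    from functools import lru_cache
    from math import log2

    @lru_cache(maxsize=None)
    def T(n):
        """Iterate the worst-split recurrence, with sizes rounded down."""
        if n <= 2:
            return 2                   # base case: two elements, <= 2 steps
        return 2 * n + T((3 * n) // 4) + T(n // 4)

    for n in [16, 256, 4096, 65536]:
        print(n, T(n), round(9 * n * log2(n)))   # recurrence vs. bound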