Algorithms: COMP3121/3821/9101/9801


Algorithms: COMP3121/3821/9101/9801
Aleks Ignjatović
School of Computer Science and Engineering, University of New South Wales
TOPIC 4: THE GREEDY METHOD
COMP3121/3821/9101/9801 1 / 23

The Greedy Method: Activity selection problem
Instance: A list of activities a_i (1 ≤ i ≤ n) with starting times s_i and finishing times f_i. No two activities can take place simultaneously.
Task: Find a maximum-size subset of compatible activities.
Attempt 1: Always choose the shortest activity which does not conflict with the previously chosen activities, remove the conflicting activities, and repeat.
[Figure: chosen activities in green, conflicting activities in red; it shows this does not work.]

Activity selection problem
Instance: A list of activities a_i (1 ≤ i ≤ n) with starting times s_i and finishing times f_i. No two activities can take place simultaneously.
Task: Find a maximum-size subset of compatible activities.
Attempt 2: Maybe among the activities which do not conflict with the previously chosen activities we should always choose the one with the earliest possible start?
[Figure: shows this does not work either.]

Activity selection problem
Instance: A list of activities a_i (1 ≤ i ≤ n) with starting times s_i and finishing times f_i. No two activities can take place simultaneously.
Task: Find a maximum-size subset of compatible activities.
Attempt 3: Maybe we should always choose an activity which conflicts with the fewest of the remaining activities? It may appear that in this way we minimally restrict our next choice... As appealing as the idea is, the figure shows this again does not work.

Activity selection problem
Instance: A list of activities a_i (1 ≤ i ≤ n) with starting times s_i and finishing times f_i. No two activities can take place simultaneously.
Task: Find a maximum-size subset of compatible activities.
The correct solution: Among the activities which do not conflict with the previously chosen activities, always choose the one with the earliest end time.

Activity selection problem
[Figure: the earliest-end-time greedy choice applied to an example instance.]

Proving optimality of a solution: transform any optimal solution into the greedy solution while keeping the number of activities equal. Find the first place where the chosen activity violates the greedy choice, and show that replacing that activity with the greedy choice produces a non-conflicting selection with the same number of activities. Continue in this manner until you morph your optimal solution into the greedy solution, thus proving that the greedy solution is also optimal.

What is the time complexity of the algorithm? We make a copy of the list of all activities and sort it in increasing order of finishing times; this takes O(n log n) time. We then go through the sorted list in order; for each activity we encounter, we use the original list to check whether its starting time is after the finishing time of the last taken activity. Every activity is handled only once, so this part of the algorithm takes O(n) time. Thus, the algorithm runs in total time O(n log n).
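The algorithm just analysed can be sketched in Python (a minimal sketch, not from the slides; representing activities as (start, finish) pairs is our own assumption):

```python
def select_activities(activities):
    """Greedy activity selection: repeatedly take the compatible
    activity with the earliest finishing time.

    activities: list of (start, finish) pairs.
    Returns a maximum-size list of pairwise compatible activities.
    """
    chosen = []
    last_finish = float("-inf")
    # Sorting by finishing time dominates the cost: O(n log n).
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

On the classic 11-activity textbook instance this selects 4 compatible activities, which is optimal.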

Activity selection problem II
Instance: A list of activities a_i (1 ≤ i ≤ n) with starting times s_i and finishing times f_i = s_i + d; thus, all activities have the same duration. No two activities can take place simultaneously.
Task: Find a subset of compatible activities of maximal total duration.
Solution: Since all activities have the same duration, this is equivalent to finding a selection with the largest number of non-conflicting activities, i.e., the previous problem.
Question: What happens if the activities are not all of the same duration? The greedy strategy no longer works; we will need a more sophisticated technique.

Minimising job lateness
Instance: A start time T_0 and a list of jobs a_i (1 ≤ i ≤ n) with durations t_i and deadlines d_i. Only one job can be performed at any time; all jobs have to be completed. If a job a_i is completed at a finishing time f_i > d_i then we say that it has incurred lateness l_i = f_i − d_i.
Task: Schedule all the jobs so that the lateness of the job with the largest lateness is minimised.
Solution: Ignore job durations and schedule the jobs in increasing order of deadlines.
Optimality: Consider any optimal solution. We say that jobs a_i and a_j form an inversion if job a_i is scheduled before job a_j but d_j < d_i.
[Figure: a_i finishing at f_i scheduled before a_j finishing at f_j, with d_j < d_i.]

Minimising job lateness
We will show that there exists a schedule without inversions which is also optimal. Recall Bubble Sort: if we manage to eliminate all inversions between adjacent jobs, eventually all inversions will be eliminated.
[Figure: two adjacent inverted jobs, before and after the swap.]
Note that swapping adjacent inverted jobs reduces the larger of the two latenesses.
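The deadline-first schedule above can be sketched as follows (a minimal sketch under the slides' assumptions: one machine, all jobs available from T_0; the function name and (duration, deadline) representation are illustrative):

```python
def schedule_min_max_lateness(jobs, T0=0):
    """Schedule jobs in increasing order of deadline and report the
    maximum lateness incurred (0 if no job is late).

    jobs: list of (duration, deadline) pairs; T0 is the common start time.
    Returns (order, max_lateness).
    """
    order = sorted(jobs, key=lambda j: j[1])  # earliest deadline first
    t = T0
    max_lateness = 0
    for duration, deadline in order:
        t += duration                          # this job finishes at time t
        max_lateness = max(max_lateness, t - deadline)
    return order, max_lateness
```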

Exercises:
Given two sequences of letters A and B, determine whether B is a subsequence of A, in the sense that one can delete some letters from A and obtain the sequence B.
There is a line of 111 stalls, some of which need to be covered with boards. You can use up to 11 boards, each of which may cover any number of consecutive stalls. Cover all the necessary stalls while covering as few total stalls as possible.

Tape storage
Instance: A list of n files f_i of lengths l_i which have to be stored on a tape. Each file is equally likely to be needed. To retrieve a file, one must start from the beginning of the tape and scan it until the file is found and read.
Task: Order the files on the tape so that the average (expected) retrieval time is minimised.
Solution: If the files are stored in order l_1, l_2, ..., l_n, then the expected time is proportional to
l_1 + (l_1 + l_2) + (l_1 + l_2 + l_3) + ... + (l_1 + l_2 + ... + l_n) = n l_1 + (n − 1) l_2 + (n − 2) l_3 + ... + 2 l_{n−1} + l_n.
This is minimised if l_1 ≤ l_2 ≤ l_3 ≤ ... ≤ l_n.
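A short sketch of the greedy ordering and the cost it minimises (the function names are illustrative, not from the slides):

```python
def tape_order(lengths):
    """Order files by increasing length; this minimises the total
    (hence average) retrieval time when all files are equally likely."""
    return sorted(lengths)

def total_retrieval_cost(lengths):
    """Sum of prefix sums: the cost of retrieving each file once,
    scanning from the start of the tape every time."""
    cost, prefix = 0, 0
    for l in lengths:
        prefix += l
        cost += prefix
    return cost
```

For lengths 1, 2, 3 the sorted order costs 1 + 3 + 6 = 10, whereas the reversed order costs 3 + 5 + 6 = 14.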

Tape storage II
Instance: A list of n files f_i of lengths l_i and probabilities of being needed p_i, with ∑_{i=1}^{n} p_i = 1, which have to be stored on a tape. To retrieve a file, one must start from the beginning of the tape and scan it until the file is found and read.
Task: Order the files on the tape so that the expected retrieval time is minimised.
Solution: If the files are stored in order l_1, l_2, ..., l_n, then the expected time is proportional to
l_1 p_1 + (l_1 + l_2) p_2 + (l_1 + l_2 + l_3) p_3 + ... + (l_1 + l_2 + ... + l_n) p_n.
We now show that this is minimised if the files are ordered in decreasing order of the ratios p_i / l_i.

Let us see what happens if we swap two adjacent files f_k and f_{k+1}. The expected times before and after the swap are, respectively,
E = l_1 p_1 + (l_1 + l_2) p_2 + ... + (l_1 + ... + l_{k−1} + l_k) p_k + (l_1 + ... + l_{k−1} + l_k + l_{k+1}) p_{k+1} + ... + (l_1 + ... + l_n) p_n
and
E′ = l_1 p_1 + (l_1 + l_2) p_2 + ... + (l_1 + ... + l_{k−1} + l_{k+1}) p_{k+1} + (l_1 + ... + l_{k−1} + l_{k+1} + l_k) p_k + ... + (l_1 + ... + l_n) p_n.
Thus, E − E′ = l_k p_{k+1} − l_{k+1} p_k, which is positive just in case l_k p_{k+1} > l_{k+1} p_k, i.e., if p_k / l_k < p_{k+1} / l_{k+1}. Consequently, E > E′ if and only if p_k / l_k < p_{k+1} / l_{k+1}, which means that the swap decreases the expected time just in case there is an inversion: a file f_{k+1} with a larger ratio p_{k+1} / l_{k+1} has been put after a file f_k with a smaller ratio p_k / l_k. For as long as there are inversions there will be inversions of consecutive files, and swapping them reduces the expected time. Consequently, the optimal solution is the one with no inversions.
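The ordering established by the exchange argument can be sketched as follows (a minimal sketch; the (length, probability) representation is our own assumption):

```python
def tape_order_weighted(files):
    """Order files to minimise expected retrieval time: decreasing
    ratio probability / length, as the exchange argument shows.

    files: list of (length, probability) pairs.
    """
    return sorted(files, key=lambda f: f[1] / f[0], reverse=True)

def expected_retrieval_time(files):
    """E = sum over files of (prefix length up to and including it) * p."""
    e, prefix = 0.0, 0.0
    for length, p in files:
        prefix += length
        e += prefix * p
    return e
```

For two files of lengths 1 and 2 with equal probability 0.5, the ratio order puts the short file first, giving E = 0.5·1 + 0.5·3 = 2.0 versus 2.5 for the reverse order.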

Let X be a set of n intervals on the real line. We say that a set P of points stabs X if every interval in X contains at least one point in P; see the figure below. Describe and analyse an efficient algorithm to compute the smallest set of points that stabs X. Assume that your input consists of two arrays X_L[1..n] and X_R[1..n], representing the left and right endpoints of the intervals in X.
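One possible solution, sketched rather than proved here: sort intervals by right endpoint and place a point at the right endpoint of every interval not already stabbed; this mirrors the earliest-end-time idea from activity selection.

```python
def stab_points(XL, XR):
    """Greedy interval stabbing: process intervals in order of right
    endpoint; whenever the current interval misses the last placed
    point, place a new point at its right endpoint.  O(n log n)."""
    intervals = sorted(zip(XL, XR), key=lambda iv: iv[1])
    points = []
    last = None
    for left, right in intervals:
        if last is None or left > last:  # not stabbed by the last point
            last = right
            points.append(right)
    return points
```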

Assume you are given n sorted arrays of different sizes. You are allowed to merge any two arrays into a single new sorted array, and you proceed in this manner until only one array is left. Design an algorithm that achieves this task using the minimal total number of moves of elements of the arrays. Give an informal justification of why your algorithm is optimal.
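A sketch of one natural greedy answer (merging a pair of arrays of sizes a and b moves a + b elements, so always merge the two smallest; this is the same greedy pattern as Huffman's algorithm on the array sizes):

```python
import heapq

def min_merge_moves(sizes):
    """Total element moves when repeatedly merging the two currently
    smallest sorted arrays until one remains."""
    heap = list(sizes)
    heapq.heapify(heap)
    moves = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        moves += a + b           # merging moves every element of both arrays
        heapq.heappush(heap, a + b)
    return moves
```

For sizes 1, 2, 3: merge 1 and 2 (3 moves), then 3 and 3 (6 moves), for 9 moves total.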

Along the long, straight road from Loololong to Goolagong, houses are scattered quite sparsely, sometimes with long gaps between two consecutive houses. Telstra must provide mobile phone service to the people who live alongside the road, and the range of Telstra's cell base station is 5 km. Design an algorithm for placing the minimal number of base stations alongside the road, sufficient to cover all houses.
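The standard greedy answer can be sketched as follows (a sketch under the assumption that houses are given as coordinates in km along the road):

```python
def place_towers(houses, coverage=5):
    """Greedy tower placement: walk east from the westernmost uncovered
    house and put a tower `coverage` km east of it; that tower covers
    every house within `coverage` km, i.e. up to 2*coverage past the house."""
    towers = []
    covered_up_to = float("-inf")
    for x in sorted(houses):
        if x > covered_up_to:
            tower = x + coverage
            towers.append(tower)
            covered_up_to = tower + coverage
    return towers
```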

Using the greedy method to derive properties of the object constructed
Once again, along the long, straight road from Loololong (in the west) to Goolagong (in the east), houses are scattered quite sparsely, sometimes with long gaps between two consecutive houses. Telstra must provide mobile phone service to the people who live alongside the road, and the range of Telstra's cell base station is 5 km. One of Telstra's engineers started with the house closest to Loololong and put a tower 5 km to its east. He then found the westernmost house not already in the range of that tower, placed another tower 5 km to its east, and continued in this way until he reached Goolagong. His junior associate did exactly the same but starting from the east and moving westwards, and claimed that his method required fewer towers. Is there a placement of houses for which the associate is right?

Assume you have $2, $1, 50c, 20c, 10c and 5c coins to pay for your lunch. Design an algorithm that, given an amount that is a multiple of 5c, pays it with a minimal number of coins.
Assume the denominations of your n + 1 coins are 1, c, c^2, c^3, ..., c^n for some integer c > 1. Design a greedy algorithm which, given any amount, pays it with a minimal number of coins.
Give an example of a set of denominations containing the single-cent coin for which the greedy algorithm does not always produce an optimal solution.
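The greedy strategy for the first two parts, plus a counterexample for the third, can be sketched as follows (amounts in cents; a minimal sketch, not from the slides):

```python
def greedy_change(amount, denominations):
    """Pay `amount` using the largest coin possible at every step.
    Optimal for the Australian system and for denominations
    1, c, c^2, ..., c^n, but NOT for arbitrary denominations."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            amount -= d
            coins.append(d)
    return coins

# Counterexample containing the 1-cent coin: for denominations {1, 3, 4}
# and amount 6, greedy pays 4 + 1 + 1 (three coins) but 3 + 3 uses two.
```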

Huffman code
Assume you are given a set of symbols, for example the English alphabet plus punctuation marks and a blank space (to be used between words). You want to encode these symbols using binary strings so that sequences of such symbols can be decoded in an unambiguous way. One way of doing so is to reserve bit strings of equal and sufficient length, given the number of distinct symbols to be encoded; say, if you have 26 letters plus 6 punctuation symbols, you would need strings of length 5 (2^5 = 32). To decode a piece of text, you would partition the bit stream into groups of 5 bits and use a lookup table to decode the text. However, this is not an economical way: all the symbols have codes of equal length, but the symbols are not equally frequent. One would prefer an encoding in which frequent symbols such as a, e, i or t have short codes, while infrequent ones, such as w, x and y, can have longer codes. However, if the codes are of variable length, how can we partition a bitstream UNIQUELY into segments, each corresponding to a code? One way of ensuring unique readability of codes from a single bitstream is to ensure that no code of a symbol is a prefix of a code for another symbol. Codes with this property are called prefix codes.

Huffman code
We can now formulate the problem: given the frequencies (probabilities of occurrence) of each symbol, design an optimal prefix code, i.e., a prefix code such that the expected length of an encoded text is as small as possible. Note that this amounts to saying that the average number of bits per symbol in an average text is as small as possible.
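Huffman's greedy construction (merge the two least frequent subtrees, repeat) can be sketched as follows; representing subtrees as symbol-to-bitstring dictionaries is an implementation choice of this sketch, not part of the slides:

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build an optimal prefix code by repeatedly merging the two least
    frequent subtrees.  freq: dict symbol -> frequency.
    Returns dict symbol -> bitstring."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend 0 to the codes of one subtree and 1 to the other.
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]
```

On the classic six-symbol instance with frequencies 45, 13, 12, 16, 9, 5 the most frequent symbol gets a 1-bit code and the total weighted code length is 224.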

0-1 knapsack problem
Instance: A list of weights w_i and values v_i for discrete items a_i, 1 ≤ i ≤ n, and a maximal weight limit W of your knapsack.
Task: Find a subset S of all available items such that its total weight does not exceed W and its total value is maximal.
Can we always choose the item with the highest value per unit of weight? Assume there are just three items with weights and values (10 kg, $60), (20 kg, $100), (30 kg, $120), and a knapsack of capacity W = 50 kg. Greedy would choose (10 kg, $60) and (20 kg, $100), while the optimal solution is to take (20 kg, $100) and (30 kg, $120).
So when does the greedy strategy work? Unfortunately, there is no easy rule...
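The slide's counterexample can be checked directly: a minimal sketch comparing the ratio greedy against an exact dynamic-programming solution (the DP is a standard technique, not something the slides develop here):

```python
def greedy_by_ratio(items, W):
    """Take items in decreasing value-per-kg order while they fit.
    NOT optimal for 0-1 knapsack, as the example shows."""
    value = weight = 0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight + w <= W:
            weight += w
            value += v
    return value

def knapsack_dp(items, W):
    """Exact 0-1 knapsack by dynamic programming, O(nW) time."""
    best = [0] * (W + 1)
    for w, v in items:
        # Iterate capacities downwards so each item is used at most once.
        for cap in range(W, w - 1, -1):
            best[cap] = max(best[cap], best[cap - w] + v)
    return best[W]
```

On the slide's instance the greedy collects $160 while the true optimum is $220.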