CS 161: Design and Analysis of Algorithms


Greedy Algorithms 1: Shortest Paths In Weighted Graphs Greedy Algorithms BFS as a greedy algorithm Shortest Paths in Weighted Graphs Dijkstra's Algorithm

Greedy Algorithms Build up solution, making most obvious decision at each step Example: Making change How can I make x cents using the fewest number of coins/bills?

Making Change General problem: Given integer coin values v_1, ..., v_n and a target amount W, find a set of coins (allowing repetition) whose total value is W that minimizes the total number of coins. Greedy Algorithm: If W = 0, give nothing and stop. Otherwise, find the largest v_i ≤ W, add v_i to the set, and subtract v_i from W. Repeat.
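A minimal Python sketch of this greedy loop (function name hypothetical):

```python
def greedy_change(values, W):
    """Repeatedly take the largest coin value <= the remaining amount W."""
    coins = []
    for v in sorted(values, reverse=True):  # largest denominations first
        while v <= W:
            coins.append(v)
            W -= v
    return coins

print(greedy_change([1, 5, 10, 25], 63))  # [25, 25, 10, 1, 1, 1]
```

As long as the values are positive and include 1, the loop terminates with total value exactly W.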

Making Change For U.S. currency, we have v_1 = 1, v_2 = 5, v_3 = 10, v_4 = 25 (ignore higher values). Does the greedy algorithm give the best solution?

Greedy Algorithm Optimal? Claim: greedy is optimal for U.S. coins. Equivalent to showing the following: For any amount W, there is an optimal solution using the largest v_i that is at most W.

Greedy Algorithm Optimal Proof of equivalence: If greedy is optimal, the claim holds, since greedy itself uses the largest v_i that is at most W. Other direction: assume the claim, and inductively assume greedy is optimal for all W' < W. Say greedy uses g coins and optimal uses p coins. Let P be an optimal solution for W that uses the largest v_i. Greedy first picks v_i, then solves W − v_i. By induction, greedy is optimal on W − v_i, using g − 1 coins. P without v_i is a solution for W − v_i with p − 1 coins. Therefore, g − 1 ≤ p − 1, so g ≤ p.

Greedy Algorithm Optimal? Claim: greedy is optimal for U.S. coins. Proof: An optimal solution must have at most 2 dimes (otherwise can replace 3 dimes with a quarter and a nickel). If 2 dimes, no nickels (otherwise can replace 2 dimes and 1 nickel with a quarter). At most 1 nickel (otherwise can replace 2 nickels with a dime). At most 4 pennies (otherwise can replace 5 pennies with a nickel).

Greedy Algorithm Optimal? In optimal solutions: Total value of pennies: < 5 cents Total value of pennies and nickels: < 10 cents Total value of non-quarters: < 25 cents Therefore we always use the largest coin, so greedy optimal

Making Change Is the greedy algorithm optimal in general? Say we have 20-cent pieces as well. What if W = 40? Optimal: two 20-cent pieces (2 coins). Greedy: first picks a quarter, then a dime, then a nickel (3 coins). In general, the greedy algorithm does not give an optimal solution for the Making Change Problem!
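The 40-cent example can be checked mechanically. This sketch (names hypothetical) compares the greedy count against a dynamic-programming optimum:

```python
def greedy_change(values, W):
    # Greedy: repeatedly take the largest coin value <= remaining amount.
    coins = []
    for v in sorted(values, reverse=True):
        while v <= W:
            coins.append(v)
            W -= v
    return coins

def min_coins(values, W):
    # DP optimum: best[a] = fewest coins summing to exactly a.
    INF = float("inf")
    best = [0] + [INF] * W
    for amount in range(1, W + 1):
        for v in values:
            if v <= amount:
                best[amount] = min(best[amount], best[amount - v] + 1)
    return best[W]

coins = [1, 5, 10, 20, 25]
print(greedy_change(coins, 40))  # [25, 10, 5] -- 3 coins
print(min_coins(coins, 40))      # 2 (two 20-cent pieces)
```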

Making Change What properties of the coin values let greedy be optimal? Let r_t = Ceiling(v_{t+1}/v_t), s_t = r_t · v_t. Theorem: If, for each t = 1, ..., n−1, the greedy algorithm outputs fewer than r_t coins for the value W_t = s_t − v_{t+1}, then greedy is always optimal.

Example: U.S. Currency

t            1   2   3   4
v_t          1   5   10  25
r_t          5   2   3   N/A
s_t          5   10  30  N/A
W_t          0   0   5   N/A
Greedy(W_t)  0   0   1   N/A

Example: U.S. Currency with 20 Cent Coin

t            1   2   3   4   5
v_t          1   5   10  20  25
r_t          5   2   2   2   N/A
s_t          5   10  20  40  N/A
W_t          0   0   0   15  N/A
Greedy(W_t)  0   0   0   2   N/A
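Both tables can be reproduced by computing r_t, s_t, and W_t directly from the theorem's definitions. A sketch (helper names hypothetical):

```python
import math

def greedy_count(values, W):
    # Number of coins the greedy algorithm uses for amount W.
    count = 0
    for v in sorted(values, reverse=True):
        count += W // v
        W %= v
    return count

def passes_greedy_test(values):
    # Check the theorem's condition at each t (0-indexed here):
    # r_t = ceil(v_{t+1}/v_t), s_t = r_t * v_t, W_t = s_t - v_{t+1};
    # require Greedy(W_t) < r_t.
    for t in range(len(values) - 1):
        v, v_next = values[t], values[t + 1]
        r = math.ceil(v_next / v)
        W_t = r * v - v_next
        if greedy_count(values, W_t) >= r:
            return False
    return True

print(passes_greedy_test([1, 5, 10, 25]))      # True
print(passes_greedy_test([1, 5, 10, 20, 25]))  # False: W_4 = 15 takes 2 coins, r_4 = 2
```

Note the theorem is one-directional: a False result means the sufficient condition fails, not by itself that greedy is suboptimal (though here it is, as the 40-cent example shows).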

Greedy Algorithms General Goal: give a simple greedy algorithm, prove that it gives optimal solution Proving optimality is usually the hard part

BFS as Greedy Algorithm Problem: Given a source node v, compute the distances to all nodes in the graph. Approach: Set all distances to ∞, update greedily. Set distance(u) = ∞ for u ≠ v. Set distance(v) = 0. Repeatedly process nodes: Find the closest node u that hasn't been processed. For each edge (u,w), update the distance to w.

BFS as Greedy Algorithm When at a node u, to update the distance to w: distance(w) = Min(distance(w), distance(u) + 1). Find the closest unprocessed node by keeping a queue: the queue contains unprocessed nodes that have distance < ∞. Works because we always add nodes to the queue in order of increasing distance, and distances are never updated once in the queue.
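The queue-based BFS just described, as a Python sketch (names hypothetical; adj maps each node to its neighbor list):

```python
from collections import deque

def bfs_distances(adj, v):
    distance = {u: float("inf") for u in adj}
    distance[v] = 0
    queue = deque([v])  # unprocessed nodes with finite distance
    while queue:
        u = queue.popleft()  # closest unprocessed node
        for w in adj[u]:
            if distance[w] == float("inf"):
                # first time w is reached; its distance is now final
                distance[w] = distance[u] + 1
                queue.append(w)
    return distance

adj = {"v": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
print(bfs_distances(adj, "v"))  # {'v': 0, 'a': 1, 'b': 1, 'c': 2}
```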

Why Greedy? We update neighbors of closest nodes first

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u).

Proof True at the beginning. Inductively assume true for the first i−1 updates. Let distance_i(u) be the distance at step i. For step i, distance_i(w) = Min(distance_{i−1}(w), distance_{i−1}(u) + 1). By induction, distance_{i−1}(w) ≥ d(w) and distance_{i−1}(u) ≥ d(u). d(w) ≤ d(u) + 1 ≤ distance_{i−1}(u) + 1. Therefore, distance_i(w) ≥ d(w).

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u). Claim 2: After processing a node w, all processed nodes u have distance(u) = d(u) ≤ d(w), all unprocessed nodes u' have d(u') ≥ d(w), and for all u, distance(u) is the length of the shortest path from v to u where intermediate nodes are constrained to be processed.

Proof True at the beginning. Inductively assume true for the first i−1 nodes we process. Now process w. If distance(w) > d(w), then the shortest path to w contains nodes that haven't been processed. Let u be the first such node. All nodes on the shortest path before u have been processed, so distance(u) = d(u) ≤ d(w) < distance(w). Therefore, u would have been processed instead of w, and we would have set distance(w) ≤ d(u) + 1 = d(w), a contradiction.

Proof Say we are processing w and update w'. Let dist be the length of the shortest path to w' through processed nodes, and let u be the last node on this path. If w is on the path, then u = w (otherwise, since d(u) ≤ d(w), we can bypass w without increasing the distance), and distance(w') = distance(w) + 1. If w is not on the path, then distance(w') was already set correctly. Thus distance(w') = Min of these two cases.

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u). Claim 2: After processing a node w, all processed nodes u have distance(u) = d(u) ≤ d(w), all unprocessed nodes u' have d(u') ≥ d(w), and for all u, distance(u) is the length of the shortest path from v to u where intermediate nodes are constrained to be processed. Therefore, once all nodes are processed, all distances are correct.

Weighted Edges [Figure: example graph with source v and weighted edges, weights ranging from 1 to 7]

Weighted Edges Length of path = sum of weights of edges Shortest path = path with shortest length Distance from v to u = length of shortest path from v to u

Shortest Path with Positive Integer Weights Idea: replace an edge of length k with k−1 nodes and k unweighted edges. [Figure: the example weighted graph from before]

Shortest Path with Positive Integer Weights Idea: replace an edge of length k with k−1 nodes and k unweighted edges. [Figure: the same graph after expansion into unweighted edges]

Running Time? Let W be the total weight. |V'|: |V| + extra nodes. An edge of weight w gets w−1 extra nodes. Sum over all edges: W − |E|. |V'| = |V| − |E| + W. |E'| = W. O(|V'| + |E'|) = O(|V| + W).
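A sketch of the reduction (directed edges for simplicity; names hypothetical): each weight-k edge becomes a chain of k unit edges through k−1 fresh nodes, and BFS on the expanded graph gives the weighted distances.

```python
from collections import deque

def expand(edges):
    # Replace each edge (u, w, k) by k-1 fresh nodes and k unit edges.
    adj = {}
    def add(a, b):
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, [])
    counter = 0
    for u, w, k in edges:
        prev = u
        for _ in range(k - 1):
            counter += 1
            mid = ("extra", counter)  # hypothetical name for an extra node
            add(prev, mid)
            prev = mid
        add(prev, w)
    return adj

def bfs(adj, v):
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

edges = [("v", "a", 2), ("a", "c", 2), ("v", "b", 3), ("b", "c", 1)]
d = bfs(expand(edges), "v")
print({u: d[u] for u in "vabc"})  # {'v': 0, 'a': 2, 'b': 3, 'c': 4}
```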

Problem Can't handle non-integer weights. W can be very large, giving poor performance.

Solution: Dijkstra's Algorithm Recall the BFS high-level idea: Set distance(u) = ∞ for u ≠ v. Set distance(v) = 0. Repeatedly process nodes: Find the closest node u that hasn't been processed. For each edge (u,w), update the distance to w. Works for general (non-negative) edge weights! The update sets distance(w) = Min(distance(w), distance(u) + weight(u,w)). Issue: now distances might be updated multiple times, so we can't use a queue.

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u).

Proof True at the beginning. Inductively assume true for the first i−1 updates. Let distance_i(u) be the distance at step i. For step i, distance_i(w) = Min(distance_{i−1}(w), distance_{i−1}(u) + weight(u,w)). By induction, distance_{i−1}(w) ≥ d(w) and distance_{i−1}(u) ≥ d(u). d(w) ≤ d(u) + weight(u,w) ≤ distance_{i−1}(u) + weight(u,w). Therefore, distance_i(w) ≥ d(w).

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u). Claim 2: After processing a node w, all processed nodes u have distance(u) = d(u) ≤ d(w), all unprocessed nodes u' have d(u') ≥ d(w), and for all u, distance(u) is the length of the shortest path from v to u where intermediate nodes are constrained to be processed.

Proof True at the beginning. Inductively assume true for the first i−1 nodes we process. Now process w. If distance(w) > d(w), then the shortest path to w contains nodes that haven't been processed. Let u be the first such node. All nodes on the shortest path before u have been processed, so distance(u) = d(u) ≤ d(w) < distance(w). Therefore, u would have been processed instead of w, and we would have set distance(w) ≤ d(u) + weight(u,w) = d(w), a contradiction.

Proof Say we are processing w and update w'. Let dist be the length of the shortest path to w' through processed nodes, and let u be the last node on this path. If w is on the path, then u = w (otherwise, since d(u) ≤ d(w), we can bypass w without increasing the distance), and distance(w') = distance(w) + weight(w,w'). If w is not on the path, then distance(w') was already set correctly. Thus distance(w') = Min of these two cases.

Why Correct? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u). Claim 2: After processing a node w, all processed nodes u have distance(u) = d(u) ≤ d(w), all unprocessed nodes u' have d(u') ≥ d(w), and for all u, distance(u) is the length of the shortest path from v to u where intermediate nodes are constrained to be processed. Therefore, once all nodes are processed, all distances are correct.

How to Allow Multiple Updates A queue no longer works. Instead, use a heap! The heap contains all unprocessed nodes, ordered by distance. To find the closest: deletemin(). To update: distance(w) = Min(distance(w), distance(u) + weight(u,w)), then decreasekey(w).

Solution: Dijkstra's Algorithm Set distance(v) = 0, distance(u) = ∞ for u ≠ v. Create heap q with all nodes, ordered by distance. While q is not empty: Let u = q.deletemin(). For each edge (u,w) in E: distance(w) = Min(distance(w), distance(u) + weight(u,w)); decreasekey(w).
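A Python sketch of this loop (names hypothetical). Python's heapq module has no decreasekey, so a standard substitute is used: push a new entry on every update and skip stale entries for already-processed nodes; the behavior is the same.

```python
import heapq

def dijkstra(adj, v):
    # adj maps u -> list of (neighbor, weight); all weights non-negative.
    distance = {u: float("inf") for u in adj}
    distance[v] = 0
    heap = [(0, v)]
    processed = set()
    while heap:
        d, u = heapq.heappop(heap)  # closest unprocessed node
        if u in processed:
            continue  # stale entry left over from an earlier update
        processed.add(u)
        for w, wt in adj[u]:
            if d + wt < distance[w]:
                distance[w] = d + wt
                heapq.heappush(heap, (distance[w], w))
    return distance

adj = {"v": [("a", 2), ("b", 5)], "a": [("b", 1), ("c", 4)], "b": [("c", 1)], "c": []}
print(dijkstra(adj, "v"))  # {'v': 0, 'a': 2, 'b': 3, 'c': 4}
```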

Running Time? |V| deletemin operations. |V| + |E| insert/decreasekey operations (|V| + 2|E| in undirected graphs). Recall heap operations: deletemin: O(log |V|). insert/decreasekey: O(log |V|). Total time: O((|V| + |E|) log |V|).

Potential Optimization: Lists Implement the heap operations using a linked list. Insert: add to the beginning of the list, O(1). decreasekey: do nothing, O(1). deletemin: search the whole list for the minimum, O(|V|). Total time: O(|V|²). Better for dense graphs, |E| = O(|V|²). Worse for sparse graphs.

Best of Both Worlds: d-ary Heaps Recall d-ary heap operations: deletemin: O(d log |V| / log d). decreasekey: O(log |V| / log d). Total time: O(|V| · d log |V| / log d + (|V| + |E|) log |V| / log d) = O((d|V| + |E|) log |V| / log d). What d minimizes this quantity?

Choosing d Minimize O((d|V| + |E|) log |V| / log d). Difficult to minimize exactly. Possible to show that d = O(|E|/|V|) is optimal. Some cases: Sparse: |E| = O(|V|), d = O(1), time = O(|V| log |V|). Same as with binary heaps. Dense: |E| = O(|V|²), d = O(|V|), time = O(|V|²). Same as with linked lists. Intermediate: |E| = O(|V|^(1+c)), d = O(|V|^c), time = O(|V|^(1+c)). Linear, better than both binary heaps and linked lists.
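A small numeric illustration of the trade-off being minimized (the graph sizes here are arbitrary, chosen only to show the dense case):

```python
import math

def dary_ops(V, E, d):
    # Rough operation count for Dijkstra with a d-ary heap:
    # (d*V + E) * log(V) / log(d), valid for d >= 2.
    return (d * V + E) * math.log(V) / math.log(d)

V, E = 1000, 1000**2  # a dense graph, |E| = |V|^2
# With d = |E|/|V| the bound is much smaller than with a binary heap:
print(dary_ops(V, E, E // V) < dary_ops(V, E, 2))  # True
```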

Even Better: Fibonacci Heap A complicated data structure. deletemin: O(log |V|). insert/decreasekey: O(1) amortized. Total running time: O(|E| + |V| log |V|). Only asymptotically better when |E| is close to |V|, but not O(|V|). Example: |E| = O(|V| log |V|).

Finding Actual Path So far, we only compute the distances from v to the other nodes. How do we compute the actual shortest path? If processing node u causes the last change to distance(w), then some shortest path to w passes through u. Keep track of prev(w), the previous node on a shortest path to w.

Finding Actual Path Set distance(v) = 0, distance(u) = ∞ for u ≠ v. Create heap q with all nodes, ordered by distance. While q is not empty: Let u = q.deletemin(). For each edge (u,w) in E: If distance(u) + weight(u,w) < distance(w): distance(w) = distance(u) + weight(u,w); prev(w) = u; decreasekey(w).

Finding Actual Path To find the path from v to w, follow the prev pointers from w back to v.
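Following the prev pointers, as a sketch (names hypothetical; the prev table here is hand-filled the way Dijkstra would fill it for some small graph):

```python
def reconstruct_path(prev, v, w):
    # Walk prev pointers from w back to v, then reverse into forward order.
    path = [w]
    while path[-1] != v:
        path.append(prev[path[-1]])
    return path[::-1]

prev = {"a": "v", "b": "v", "c": "b"}
print(reconstruct_path(prev, "v", "c"))  # ['v', 'b', 'c']
```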

Negative Edges If we have negative edges, Dijkstra's algorithm fails. [Figure: graph with source V and edges V→A (weight 2), V→B (weight 3), A→C (weight 2), B→A (weight −2)] Step by step: distance(V) = 0; processing V sets distance(A) = 2 and distance(B) = 3; processing A sets distance(C) = 4; processing B lowers distance(A) to 1, but A has already been processed, so distance(C) is never updated again. Distance from V to C is 3, not 4!
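The failure can be reproduced directly. This self-contained sketch runs Dijkstra (each node processed exactly once, as in the algorithm above) on the example graph; node names match the figure:

```python
import heapq

def dijkstra(adj, v):
    # Each node is processed exactly once -- the step that breaks with negative edges.
    distance = {u: float("inf") for u in adj}
    distance[v] = 0
    heap, processed = [(0, v)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in processed:
            continue
        processed.add(u)  # u is now final and never revisited
        for w, wt in adj[u]:
            if d + wt < distance[w]:
                distance[w] = d + wt
                heapq.heappush(heap, (distance[w], w))
    return distance

# V->A (2), V->B (3), A->C (2), B->A (-2); true distance V->C is 3 via V,B,A,C.
adj = {"V": [("A", 2), ("B", 3)], "A": [("C", 2)], "B": [("A", -2)], "C": []}
print(dijkstra(adj, "V")["C"])  # 4 -- wrong, because A was processed before B lowered it
```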

Why Fail on Negative Edges? Let d(u) be the correct distance to u. Claim 1: All nodes have distance(u) ≥ d(u). Claim 2: After processing a node w, all processed nodes u have distance(u) = d(u) ≤ d(w), all unprocessed nodes u' have d(u') ≥ d(w), and for all u, distance(u) is the length of the shortest path from v to u where intermediate nodes are constrained to be processed.

Why Fail on Negative Edges True at the beginning. Inductively assume true for the first i−1 nodes we process. Now process w. If distance(w) > d(w), then the shortest path to w contains nodes that haven't been processed. Let u be the first such node. The proof needs distance(u) = d(u) ≤ d(w) < distance(w), so that u would have been processed instead of w and we would have set distance(w) ≤ d(u) + weight(u,w) = d(w). But with negative edges, d(u) ≤ d(w) can fail: a later edge on the path can have negative weight, so the prefix ending at u need not be shorter than the whole path to w. This is the step that breaks.

Bigger Problem: Negative Cycles In the presence of a negative cycle, the shortest path problem is undefined. [Figure: graph with edges V→A (weight 1), A→B (weight 1), B→A (weight −2), A→C (weight 1); the cycle A→B→A has total weight −1] Path VAC has length 2. Path VABAC has length 1. Path VABABAC has length 0. Path VABABABAC has length −1.
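The ever-shorter paths can be tallied directly (edge weights from the example):

```python
# Edge weights from the example: V->A (1), A->B (1), B->A (-2), A->C (1).
weight = {("V", "A"): 1, ("A", "B"): 1, ("B", "A"): -2, ("A", "C"): 1}

def path_length(path):
    # Sum the weights of consecutive edges along the path.
    return sum(weight[(path[i], path[i + 1])] for i in range(len(path) - 1))

for p in ["VAC", "VABAC", "VABABAC", "VABABABAC"]:
    print(p, path_length(p))  # lengths 2, 1, 0, -1: each ABA loop subtracts 1
```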

Negative Cycles If there are negative cycles, there is no shortest path. Possible solution: add a constant value to every edge. This does not compute shortest paths! (Paths with more edges are penalized more.) Possible solution: shortest simple path. There are finitely many simple paths, so a shortest one exists. Turns out to be very hard (NP-hard) in the presence of negative cycles.