Discrete Optimization 2010, Lecture 2: Matroids & Shortest Paths
Marc Uetz, University of Twente, m.uetz@utwente.nl
Lecture 2: sheet 1 / 25, Marc Uetz, Discrete Optimization
Prim's Algorithm (1957)

Algorithm 1: Prim
input : G = (V, E, c)
output: T ⊆ E, a minimum spanning tree of G
let W = {v₀}, an arbitrary start vertex; T = ∅;
for (i = 1, ..., n−1) do
  e_i = {v, u} := argmin{c_e | e ∈ δ(W)}; // e_i is the cheapest edge leaving W; v ∈ W and u ∉ W
  T = T ∪ {e_i}; W = W ∪ {u}

Correctness follows directly from the Cut Condition.
The computation time can be shown to be O(n²).
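A minimal Python sketch of the O(n²) scheme above, under the assumption that vertices are numbered 0, ..., n−1 and edges are given as (u, v, cost) triples; the function name `prim_mst` and the cost-matrix representation are illustrative choices, not from the slides.

```python
import math

def prim_mst(n, edges):
    # Build a symmetric cost matrix; vertices are assumed to be 0..n-1.
    cost = [[math.inf] * n for _ in range(n)]
    for u, v, c in edges:
        cost[u][v] = min(cost[u][v], c)
        cost[v][u] = min(cost[v][u], c)
    in_tree = [False] * n
    in_tree[0] = True              # W = {0}, an arbitrary start vertex
    best = cost[0][:]              # cheapest known edge from W to each vertex
    pred = [0] * n
    tree = []
    for _ in range(n - 1):
        # e_i = argmin{c_e : e in delta(W)}: cheapest edge leaving W
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        tree.append((pred[u], u, best[u]))
        in_tree[u] = True          # W = W + u
        for v in range(n):         # update candidate edges leaving the new W
            if not in_tree[v] and cost[u][v] < best[v]:
                best[v] = cost[u][v]
                pred[v] = u
    return tree
```

The O(n) minimum scan per iteration, repeated n−1 times, gives the O(n²) bound stated above.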
More remarks about MST (read the literature)

With a little more care (data structures & analysis), Kruskal's algorithm can be implemented to run in O(m log m).
Kruskal's algorithm grows a forest (a collection of several trees) on V.
Prim's algorithm grows one (minimal) spanning tree only, on a subset W of the nodes of V.

Take-away message: A good bound on computation time may depend on clever data structures (& tricks); just some (polynomial) bound is often easy.
Outline
1 Matroids
2 Shortest Paths: General Graphs; Nonnegative Arc Lengths
Independence Systems and Matroids

Definition (Matroid). Let S = {1, ..., n} be a finite set and I ⊆ 2^S a family of subsets of S (the independent sets). Then M = (S, I) is an independence system if:
1 ∅ ∈ I (the empty set is independent)
2 If J ∈ I and I ⊆ J, then I ∈ I (I is closed under taking subsets)
Moreover, M = (S, I) is a matroid if in addition:
3 For all A ⊆ S, all maximal independent subsets of A have the same cardinality.

If the elements s of S have weights w_s, we call it a weighted matroid.
The maximal independent subsets of A ⊆ S are called bases of A.
The rank r(A) is the size of any basis of A (well-defined for matroids).
The Greedy Algorithm

Algorithm 2: Greedy
input : weighted matroid M = (S, I, w), any A ⊆ S
output: minimum [maximum] weight basis F of A
let F = ∅; // note ∅ ∈ I
while (∃ s ∈ A such that F ∪ {s} ∈ I) do
  choose such s ∈ A with minimal [maximal] weight w_s;
  F = F ∪ {s}; // greedily add s to F

Note: For any particular matroid, we need an oracle (an algorithm) that tells us whether F ∪ {s} ∈ I (i.e., whether F ∪ {s} is independent).
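The algorithm above can be sketched generically in Python, with the independence oracle passed in as a function; `greedy_basis` and its signature are illustrative choices, not from the slides.

```python
def greedy_basis(elements, weight, independent, maximize=False):
    """Greedy for a weighted matroid, given an independence oracle.
    `independent(F)` must return True iff the set F is independent."""
    F = set()
    # Scanning elements in weight order is equivalent to repeatedly
    # choosing the cheapest [heaviest] element that keeps F independent.
    for s in sorted(elements, key=weight, reverse=maximize):
        if independent(F | {s}):
            F.add(s)
    return F
```

For example, with the uniform matroid where every set of size at most 2 is independent, the oracle is simply `lambda F: len(F) <= 2`, and maximizing greedy returns the two heaviest elements.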
Kruskal's Algorithm Revisited

Kruskal's Algorithm:
let T = ∅ (T is a forest)
while (∃ edge e ∈ E \ T s.t. T ∪ {e} is a forest)
  pick such e with minimal cost c_e
  T = T ∪ {e}

Observations:
The forests of a graph form a matroid, and its bases are the spanning trees (Exercise).
Kruskal's algorithm is just the greedy algorithm applied to this particular matroid.
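A self-contained Python sketch of this observation: the union-find structure below plays the role of the independence oracle for the graphic matroid (T ∪ {e} is a forest iff e joins two different components). The function name and edge representation are illustrative, not from the slides.

```python
def kruskal_mst(n, edges):
    """Kruskal = greedy on the graphic matroid, vertices 0..n-1,
    edges given as (u, v, cost) triples."""
    parent = list(range(n))

    def find(x):                   # component root, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, c in sorted(edges, key=lambda e: e[2]):  # cheapest edge first
        ru, rv = find(u), find(v)
        if ru != rv:               # oracle: T + {u,v} is still a forest
            parent[ru] = rv
            tree.append((u, v, c))
    return tree
```

Sorting the edges once up front is what gives the O(m log m) bound mentioned earlier.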
The Greedy Algorithm

Theorem. The greedy algorithm computes a min [max] weight basis of any A ⊆ S.

Proof. Let F be the output of Greedy and T a minimal weight basis; note |F| = |T|.
Let F = {f₁, ..., f_l} and T = {t₁, ..., t_l}, both sorted by weight.
Let k be minimal with w_{f_k} > w_{t_k}.
In iteration k, why didn't Greedy pick any t_i of t₁, ..., t_k? Because either t_i ∈ {f₁, ..., f_{k−1}} or {f₁, ..., f_{k−1}} ∪ {t_i} ∉ I.
In any case, {f₁, ..., f_{k−1}} is a basis of {f₁, ..., f_{k−1}, t₁, ..., t_k}, but {t₁, ..., t_k} ∈ I is a larger independent subset of that set, a contradiction.
Greedy Works Only for Matroids

Theorem. Given an independence system M = (S, I), the greedy algorithm computes min [max] weight bases for all possible weight functions w_s if and only if M is a matroid.

Proof ("only if"). Assume some A ⊆ S has two bases F, T of different cardinality, say |F| ≥ |T| + 1. Define weights
w_s = 2 for all s ∈ S \ (F ∪ T),
w_s = 1 − ε for all s ∈ F, for some ε < 1/|S|,
w_s = 1 for all s ∈ T \ F.
Then (minimizing) Greedy first picks all of F and outputs weight |F|(1 − ε) ≥ (|T| + 1)(1 − ε) > |T|, so Greedy must fail to compute the optimum, which has weight at most |T|.
Greedy or Not Greedy

Matroids:
forests of undirected graphs (your Exercise)
linear matroids: linearly independent sets of vectors in a vector space (by the Steinitz exchange lemma, all bases have the same cardinality)

Independence systems that are not matroids:
matchings in undirected graphs (but a polytime algorithm exists, Edmonds 1965)
independent (stable) sets in a graph G = (V, E) (no polytime algorithm, unless P = NP)
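A small hypothetical example in Python showing why matchings are not a matroid and greedy can fail: on the path a-b-c-d, both {bc} and {ab, cd} are maximal matchings of different sizes (violating matroid property 3), and maximizing greedy gets stuck on the single heavy edge. The weights are an illustrative choice, not from the slides.

```python
from itertools import combinations

# Path graph a-b-c-d; edge weights (a toy instance).
w = {'ab': 2, 'bc': 3, 'cd': 2}

def is_matching(M):
    """Independence oracle: no two edges in M share an endpoint."""
    return all(set(e).isdisjoint(f) for e, f in combinations(M, 2))

# Greedy (maximization): repeatedly add the heaviest edge keeping M a matching.
M = set()
for e in sorted(w, key=w.get, reverse=True):
    if is_matching(M | {e}):
        M.add(e)
greedy_weight = sum(w[e] for e in M)   # greedy takes 'bc' first and gets stuck

# Brute force over all matchings for comparison.
best = max(sum(w[e] for e in N)
           for k in range(len(w) + 1)
           for N in combinations(w, k) if is_matching(N))
```

Here greedy ends with weight 3 ({bc}) while the optimum {ab, cd} has weight 4.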
Side Remark: Polymatroids

Let M = (S, I) be a matroid with rank function r. Consider the linear program

maximize   Σ_{s∈S} w_s x_s
subject to Σ_{s∈A} x_s ≤ r(A)   for all A ⊆ S,
           x ≥ 0.

For A ⊆ S, let x^A be the characteristic vector of A.
If T ∈ I, then x^T is a feasible solution (obvious).
If Greedy computes T as a maximum weight basis, then x^T is an optimal solution (Edmonds 1970).
Outline
1 Matroids
2 Shortest Paths: General Graphs; Nonnegative Arc Lengths
Shortest Path Problem

Given: digraph G = (V, A), integer arc lengths c : A → Z, nodes s, t ∈ V.
Want: shortest path length from s to t (or to all nodes v ∈ V).
(Figure: example digraph with arc lengths.)
In undirected graphs, an edge of length 2 is modeled by two antiparallel arcs of length 2 each.
Three main cases

Arbitrary digraphs (includes negative cycle detection): label-correcting (Bellman-Ford) algorithm & Floyd-Warshall, O(nm) and O(n³) time, respectively.
Arbitrary digraphs, nonnegative arc lengths: label-setting (Dijkstra), O(n²) time.
Acyclic digraphs: dynamic programming, O(n + m) time.
Shortest Path Optimality

Let d(v), v ∈ V, be arbitrary node labels such that d(s) = 0 and d(v) ≥ (shortest path length from s to v) for all v ∈ V.

Theorem. The node labels d(v) are the shortest path lengths if and only if the Δ-inequality
d(w) ≤ d(v) + c_vw
holds for all arcs (v, w) ∈ A.
Proof

Necessity is trivial, for otherwise the arc (v, w) would be a shortcut to w via v.
Sufficiency: Let l(w) be the (true) shortest path length to w; we show d(w) ≤ l(w).
Pick a shortest (s, w)-path P = (s, ..., u, v, w) of length l(w) and apply the Δ-inequality along this path:
d(w) ≤ d(v) + c_vw ≤ d(u) + c_uv + c_vw ≤ ... ≤ d(s) + length(P) = l(w).
Shortest Path Algorithms: Ideas

Use the previous theorem: start with distance labels d(v) ≥ shortest path lengths, then correct violated Δ-inequalities (in a clever way).
start with d(s) = 0 and d(v) = ∞ for all v ≠ s
then check (and correct) the Δ-inequality along arcs [that is, if d(w) > d(v) + c_vw, then set d(w) = d(v) + c_vw and pred(w) = v]
say, in rounds, where we always check the Δ-inequality of the arcs in the order e₁, e₂, ..., e_m
question: does this stop? after how many rounds?
Claim: (n − 1) rounds suffice, if shortest path lengths exist.
Proof of Claim

Note that shortest path lengths exist ⟺ G has no negative length cycle.
Consider any shortest path (s, u, v, ..., w) to any node w, consisting of k (≤ n − 1) arcs:
after the first round, d(u) is correct; after the second round, d(v) is correct; ...; after the kth (≤ n − 1) round, d(w) is correct.
Bellman-Ford Algorithm

Algorithm: one round = label-correcting all arcs a ∈ A in some fixed order; if some Δ-inequality is still violated in the nth round, return "negative cycle".

Time complexity: trivially O(nm) [n rounds of O(m) each].

Correctness: if there is no negative length cycle, correctness follows from the previous slide. If (and only if) there is a negative length cycle, some Δ-inequality will still be violated in the nth round (actually, in any round).
All-Pairs Shortest Paths

Using n runs of Bellman-Ford, all-pairs shortest paths take O(n²m) ⊆ O(n⁴).

Definition. Assume the nodes are 1, 2, ..., n in some order. For all v, w ∈ V, define d_u(v, w) as the length of a shortest (v, w)-path that may only pass through intermediate nodes 1, 2, ..., u.
d₀(v, w) = c_vw if (v, w) ∈ A, and ∞ otherwise
by definition, d_n(v, w) is the shortest (v, w)-path length
recurrence relation: d_u(v, w) = min{d_{u−1}(v, w), d_{u−1}(v, u) + d_{u−1}(u, w)}
So, given d_{u−1}, the computation of d_u takes n² · O(1) time.
Floyd-Warshall Algorithm

Algorithm 3: Floyd-Warshall
input : G = (V, A, c)
output: d(v, w), pred(v, w) for all v, w ∈ V, or "negative cycle"
d(v, w) = c_vw and pred(v, w) = v for all (v, w) ∈ A;
d(v, w) = ∞ and pred(v, w) = ∅ for all (v, w) ∉ A;
d(v, v) = 0 for all v ∈ V;
for (u = 1, ..., n) do
  for (all v, w ∈ V) do
    if (d(v, w) > d(v, u) + d(u, w)) then
      d(v, w) = d(v, u) + d(u, w), pred(v, w) = pred(u, w);
      if ((v == w) and d(v, w) < 0) return "negative cycle";

pred(v, w) = last node before w on the (v, w)-path
correctness follows from the correctness of the recurrence relation
computation time: trivially n · n² · O(1) = O(n³)
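A Python sketch of the algorithm above (distance labels only, predecessors omitted for brevity); the function name and arc-list input are illustrative choices. A negative entry on the diagonal after processing node u signals a negative cycle.

```python
import math

def floyd_warshall(n, arcs):
    """All-pairs shortest paths via the recurrence
    d_u(v,w) = min(d_{u-1}(v,w), d_{u-1}(v,u) + d_{u-1}(u,w)); O(n^3) time.
    Nodes are 0..n-1, arcs is a list of (v, w, c)."""
    d = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0
    for v, w, c in arcs:
        d[v][w] = min(d[v][w], c)
    for u in range(n):                  # intermediate nodes 0..u now allowed
        for v in range(n):
            for w in range(n):
                if d[v][u] + d[u][w] < d[v][w]:
                    d[v][w] = d[v][u] + d[u][w]
        if any(d[v][v] < 0 for v in range(n)):   # negative diagonal entry
            raise ValueError("negative length cycle")
    return d
```

Updating d in place is safe here because d(v, u) and d(u, w) are unchanged by round u (d_u(v, u) = d_{u−1}(v, u), and likewise for d(u, w)).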
Dijkstra's Algorithm (1959)

Shortest (s, v)-paths for digraphs G = (V, A, c) with c ≥ 0.
Idea: only one correction of each Δ-inequality.

Algorithm 4: Dijkstra
input : G = (V, A, c) with c ≥ 0, start node s ∈ V
output: d(v), pred(v) for all v ∈ V
d(s) = 0 and d(v) = ∞ for all v ∈ V \ {s}; S = ∅;
while (S ≠ V) do
  pick v ∈ V \ S with smallest label d(v);
  S = S ∪ {v}; // d(v) becomes permanent
  for (all w with (v, w) ∈ A) do
    if (d(w) > d(v) + c_vw) then d(w) = d(v) + c_vw and pred(w) = v;
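A direct Python transcription of Algorithm 4, with the minimum label found by a linear scan (the simple O(n²) variant); the adjacency-dict representation is an illustrative choice, not from the slides.

```python
import math

def dijkstra(graph, s):
    """Dijkstra with a linear scan for the minimum label, O(n^2).
    `graph[v]` is a dict {w: c_vw} of outgoing arcs with c_vw >= 0."""
    d = {v: math.inf for v in graph}
    pred = {v: None for v in graph}
    d[s] = 0
    S = set()
    while len(S) < len(graph):
        # pick v outside S with the smallest tentative label
        v = min((u for u in graph if u not in S), key=lambda u: d[u])
        S.add(v)                        # d(v) is now permanent
        for w, c in graph[v].items():   # correct Delta-inequalities via v once
            if d[v] + c < d[w]:
                d[w] = d[v] + c
                pred[w] = v
    return d, pred
```

Each arc (v, w) is examined exactly once, when v enters S, which is the "only one correction of each Δ-inequality" idea.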
Example: Dijkstra
Note: the set S already has the correct distance labels at any iteration!
Correctness

We prove the following claims by induction on the number of iterations (the induction start is trivial).

Claim 1: d(u) ≤ d(w) for all u ∈ S, w ∉ S.
Iteration S → S ∪ {v}: d(u) ≤ d(v) for all u ∈ S by hypothesis.
Next, d(v) ≤ d(w) for all w ∉ S by the choice of v with minimal d(v).
Finally, d(u) ≤ d(v) ≤ d(v) + c_vw = d_new(w) for any relabeled w ∉ S (the last inequality because c_vw ≥ 0).

Claim 2: the labels within S are correct (check the Δ-inequality).
Iteration S → S ∪ {v}: by hypothesis, we only need to check arcs incident to v.
By Claim 1, d(u) ≤ d(v) ≤ d(v) + c_vu for all u ∈ S, so no u ∈ S needs relabeling via v.
Also, d(v) ≤ d(u) + c_uv was ensured at the iteration when u entered S; since then d(u) is unchanged (labels in S are permanent) and d(v) can only have decreased.

Upon termination S = V, so Dijkstra is correct.
Computation Time

Simple:
Initialization O(n).
n iterations of the while-loop; in each we need to find the smallest label in V \ S, which is doable in O(n), so O(n²) in total.
m relabeling steps of O(1), so O(m).
This gives O(n) + O(n²) + O(m) = O(n²) time.

Less simple:
An O(m log n) implementation uses a priority queue (also called a heap) to manage finding the minimal d(v) in O(1) time (the log n comes from the overhead in the organization of the heap).
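A heap-based Python sketch of the "less simple" variant, using the standard library's `heapq`; instead of decreasing keys inside the heap, it pushes updated (label, node) pairs and skips entries for nodes whose label is already permanent (a common lazy-deletion idiom, not spelled out on the slide).

```python
import heapq
import math

def dijkstra_heap(graph, s):
    """Heap-based Dijkstra: the priority queue replaces the O(n) scan
    for the smallest tentative label, giving O(m log n) overall.
    `graph[v]` is a dict {w: c_vw} of outgoing arcs with c_vw >= 0."""
    d = {v: math.inf for v in graph}
    d[s] = 0
    done = set()                        # nodes with permanent labels (the set S)
    pq = [(0, s)]                       # (label, node); stale entries allowed
    while pq:
        dv, v = heapq.heappop(pq)
        if v in done:                   # outdated entry: label already permanent
            continue
        done.add(v)
        for w, c in graph[v].items():   # relax arcs leaving v
            if dv + c < d[w]:
                d[w] = dv + c
                heapq.heappush(pq, (d[w], w))
    return d
```

Each arc causes at most one push, so the heap holds O(m) entries and each push/pop costs O(log n), matching the O(m log n) bound.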