Dantzig s pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles

Size: px

Start display at page:

Download "Dantzig s pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles"

Ezra Briggs
5 years ago
Views:

1 Dantzig s pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles Thomas Dueholm Hansen 1 Haim Kaplan Uri Zwick 1 Department of Management Science and Engineering, Stanford University, USA. School of Computer Science, Tel Aviv University, Israel. May 1, 014 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 1/1

2 The simplex algorithm, Dantzig (1947) max c T x s.t. Ax = b x 0 c Linear programming: Optimize a linear objective function subject to linear constraints. Vertices (or corners) are basic feasible solutions. The simplex algorithm: Move from vertex to vertex along edges while improving the objective. This operation is called pivoting. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles /1

3 Pivoting rules c Several improving pivots may be available for a given basic feasible solution. The edge is then chosen by a pivoting rule. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles /1

4 Pivoting rules c Several improving pivots may be available for a given basic feasible solution. The edge is then chosen by a pivoting rule. Dantzig s pivoting rule: Repeatedly use the improving pivot with most negative reduced cost. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles /1

5 Dantzig s pivoting rule Klee and Minty (197): Dantzig s pivoting rule may require exponentially many steps (the Klee-Minty cube 1 ). 1 Picture from Gärtner, Henk and Ziegler (1998) Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 4/1

6 Dantzig s pivoting rule Klee and Minty (197): Dantzig s pivoting rule may require exponentially many steps (the Klee-Minty cube 1 ). Although Dantzig s rule is exponential in the worst case, it is often efficient in practise. In this work we study Dantzig s rule when used to solve: Single source shortest paths Discounted deterministic Markov decision processes 1 Picture from Gärtner, Henk and Ziegler (1998) Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 4/1

7 Example: Single target shortest paths b 1 = 1 b = 1 b = 1 b t = t b 4 = 1 b 5 = 1 b 6 = 1 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 5/1

8 Example: Single target shortest paths b 1 = 1 b = 1 b = 1 b t = t b 4 = 1 b 5 = 1 b 6 = 1 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 5/1

9 Example: Single target shortest paths b 1 = 1 b = 1 b = 1 b t = t b 4 = 1 b 5 = 1 b 6 = 1 minimize c u,v x u,v s.t. v V : (u,v) E x v,w x u,v = b v w:(v,w) E u:(u,v) E (u, v) E : x u,v 0 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 5/1

10 Single target shortest paths The constraints ensure flow conservation. For a basic feasible solution, exactly one edge leaving every vertex has non-zero flow. There is a one-to-one correspondence between basic feasible solutions and shortest paths trees (or policies). Flow conservation: x 1 = 7 x = 0 x = 4 x 4 = x 1 + x = 1 + x + x 4 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 6/1

11 Single target shortest paths The constraints ensure flow conservation. For a basic feasible solution, exactly one edge leaving every vertex has non-zero flow. There is a one-to-one correspondence between basic feasible solutions and shortest paths trees (or policies). A pivot directs the flow along a different edge. An edge is an improving pivot (or improving switch) w.r.t. a policy iff it shortens the paths to the target. Flow conservation: x 1 = 7 x = 0 x = 4 x 4 = x 1 + x = 1 + x + x 4 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 6/1

12 Reduced costs c π 1, = = 1 8 t val π (1) = 1 val π () = 11 val π () = 8 For every policy π (shortest paths tree), let valπ(v) be the length of the path from v to t in π: (u, v) π : valπ(u) = c u,v + valπ(v) Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 7/1

13 Reduced costs c π 1, = = 1 8 val π (1) = 1 val π () = 11 val π () = 8 t For every policy π (shortest paths tree), let valπ(v) be the length of the path from v to t in π: (u, v) π : valπ(u) = c u,v + valπ(v) The reduced cost of an edge (u, v) w.r.t. π is: c π u,v := c u,v + valπ(v) valπ(u) Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 7/1

14 Reduced costs c π 1, = = 1 8 val π (1) = 1 val π () = 11 val π () = 8 t For every policy π (shortest paths tree), let valπ(v) be the length of the path from v to t in π: (u, v) π : valπ(u) = c u,v + valπ(v) The reduced cost of an edge (u, v) w.r.t. π is: c π u,v := c u,v + valπ(v) valπ(u) (u, v) is an improving switch w.r.t. π iff c π u,v < 0. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 7/1

15 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

16 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

17 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

18 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

19 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4, Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

20 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4,, 8 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

21 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4,, 8, 7 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

22 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4,, 8, 7, 8 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

23 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4,, 8, 7, 8, 7 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

24 Deterministic Markov decision processes (DMDPs) No target. Instead, we generate an infinite path and want to minimize the observed costs. Observed costs: 5, 4,, 8, 7, 8, 7, 8, 7, 8, 7,... Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 8/1

25 Deterministic Markov decision processes (DMDPs) The sum of the costs may diverge to + or. Instead we minimize the discounted sum of costs, using some discount factor γ < 1. We can also use varying discounts, in which case every edge (u, v) has its own discount factor γ u,v. Observed costs: c 0, c 1, c, c, c 4,... Discounted sum: c 0 + γc 1 + γ c + γ c +... = γ k c k Varying discounts: c 0 + γ 0 c 1 + γ 0 γ 1 c + γ 0 γ 1 γ c +... = k 1 k=0 j=0 γ j k=0 c k Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 9/1

26 Related work and results The simplex algorithm with Dantzig s rule is a natural algorithm for solving the single source shortest paths (SSSP) problem, however, its complexity is not well understood. Orlin (1985): O(mn log n) pivots for SSSP with n vertices and m edges. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 10/1

27 Related work and results The simplex algorithm with Dantzig s rule is a natural algorithm for solving the single source shortest paths (SSSP) problem, however, its complexity is not well understood. Orlin (1985): O(mn log n) pivots for SSSP with n vertices and m edges. The same bound can be obtained with the analysis of Post and Ye (01). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 10/1

28 Related work and results The simplex algorithm with Dantzig s rule is a natural algorithm for solving the single source shortest paths (SSSP) problem, however, its complexity is not well understood. Orlin (1985): O(mn log n) pivots for SSSP with n vertices and m edges. The same bound can be obtained with the analysis of Post and Ye (01). We show: O(mn log n) upper bound, and Ω(n ) lower bound, even for graphs with m = Θ(n). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 10/1

29 Related work and results The simplex algorithm with Dantzig s rule is a natural algorithm for solving the single source shortest paths (SSSP) problem, however, its complexity is not well understood. Orlin (1985): O(mn log n) pivots for SSSP with n vertices and m edges. The same bound can be obtained with the analysis of Post and Ye (01). We show: O(mn log n) upper bound, and Ω(n ) lower bound, even for graphs with m = Θ(n). Every iteration uses O(m) time, so these bounds can not compete with the O(mn) Bellman-Ford algorithm. However, Dantzig s rule is a much more general algorithm. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 10/1

30 Related work and results Bounds for the number of pivots performed by the simplex algorithm with Dantzig s rule when applied to deterministic Markov decision processes (MDPs) with n vertices (states) and m edges (actions): Post and Ye (01): O(m n log n) for uniform discounts. O(m n 5 log n) for varying discounts. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 11/1

31 Related work and results Bounds for the number of pivots performed by the simplex algorithm with Dantzig s rule when applied to deterministic Markov decision processes (MDPs) with n vertices (states) and m edges (actions): Post and Ye (01): O(m n log n) for uniform discounts. O(m n 5 log n) for varying discounts. We show: O(m n log n) for uniform discounts. O(m n 4 log n) for varying discounts, assuming that all discounts are at least 1 1/Ω(n ). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 11/1

32 Related work and results Bounds for the number of pivots performed by the simplex algorithm with Dantzig s rule when applied to deterministic Markov decision processes (MDPs) with n vertices (states) and m edges (actions): Post and Ye (01): O(m n log n) for uniform discounts. O(m n 5 log n) for varying discounts. We show: O(m n log n) for uniform discounts. O(m n 4 log n) for varying discounts, assuming that all discounts are at least 1 1/Ω(n ). Scherrer (01) generalized the result by Post and Ye (01) by identifying the properties that were needed for the proof to work. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 11/1

33 Related work and results We also show that deterministic MDPs with varying discounts (tending to 1) can model the minimum cost to time ratio cycle problem. The O(m n 4 log n) strongly polynomial bound for Dantzig s rule also applies to this setting. The only other known strongly polynomial algorithm runs in time Õ(n ) and uses Megiddo s parametric search technique (198). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 1/1

34 Minimum cost-to-time ratio cycles c 1, t 1 c, t c, t c 7, t 7 c 5, t 5 c 9, t 9 c 4, t 4 c 6, t 6 c 1, t 1 c 8, t 8 c 14, t 14 c 11, t 11 c 1, t 1 c 10, t 10 Find the cycle C that minimizes the cost-to-time ratio, ( (u,v) C c u,v )/( (u,v) C t u,v ). When t u,v = 1 for all edges (u, v) we are looking for the minimum mean cost cycle. This problem is, for instance, solved as a subroutine in the min-cost flow algorithm of Goldberg and Tarjan (1989). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 1/1

35 Important observation Dantzig s rule is oblivious to a potential transformation: Let p v be the potential of vertex v, and define new costs by: (u, v) E : c u,v := c u,v + p v p u Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 14/1

36 Important observation Dantzig s rule is oblivious to a potential transformation: Let p v be the potential of vertex v, and define new costs by: (u, v) E : c u,v := c u,v + p v p u The length of any path v 0, v 1, v,..., v k is changed by (p v1 p v0 ) + (p v p v1 ) + + (p vk p vk 1 ) = p vk p v0 Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 14/1

37 Important observation Dantzig s rule is oblivious to a potential transformation: Let p v be the potential of vertex v, and define new costs by: (u, v) E : c u,v := c u,v + p v p u The length of any path v 0, v 1, v,..., v k is changed by (p v1 p v0 ) + (p v p v1 ) + + (p vk p vk 1 ) = p vk p v0 The reduced costs remain the same after the transformation: The reduced cost c π u,v := c u,v + valπ(v) valπ(u) is the difference in length between two paths both starting at u and ending at t. The lengths of the two paths are changed by the same amount, and hence the difference remains the same. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 14/1

38 Simplifying assumptions For the analysis, we may transform the costs using the values of any policy π as potentials: (u, v) E : c u,v := c u,v + val π (v) val π (u) The transformed costs are exactly the reduced costs of π. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 15/1

39 Simplifying assumptions For the analysis, we may transform the costs using the values of any policy π as potentials: (u, v) E : c u,v := c u,v + val π (v) val π (u) The transformed costs are exactly the reduced costs of π. Assumption 1: Every edge (u, v) π has reduced cost 0 w.r.t. π. Hence, every vertex has an outgoing zero-cost edge, and these edges form a tree leading to the target t. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 15/1

40 Simplifying assumptions For the analysis, we may transform the costs using the values of any policy π as potentials: (u, v) E : c u,v := c u,v + val π (v) val π (u) The transformed costs are exactly the reduced costs of π. Assumption 1: Every edge (u, v) π has reduced cost 0 w.r.t. π. Hence, every vertex has an outgoing zero-cost edge, and these edges form a tree leading to the target t. Assumption : If we use the final policy π generated by Dantzig s rule, then all its values are 0. Since the values decrease with every iteration, we may assume that all values are non-negative. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 15/1

41 Post and Ye (01) Lemma (Post and Ye (01)) Every O(n log n) iterations an edge is eliminated such that it does not appear in later policies. Theorem (Orlin (1985), Post and Ye (01)) Dantzig s rule terminates after at most O(mn log n) iterations for single source shortest paths. The eliminated edge is the edge with most positive (transformed) cost. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 16/1

42 Post and Ye (01) Lemma (Post and Ye (01)) Every O(n log n) iterations an edge is eliminated such that it does not appear in later policies. Theorem (Orlin (1985), Post and Ye (01)) Dantzig s rule terminates after at most O(mn log n) iterations for single source shortest paths. The eliminated edge is the edge with most positive (transformed) cost. Elimination criterion: Since all values are non-negative, an edge (u, v) is eliminated once the value of u becomes sufficiently small: (u, v) π j : val πj (u) = c u,v + val πj (v) c u,v. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 16/1

43 Convergence Lemma Under assumptions 1 and, suppose π i+1 is obtained from π i by performing the improving switch with most negative reduced cost. Then: ( val πi+1 (v) 1 1 ) n val πi (v) v V v V A corresponding lemma was shown by Orlin (1985) and by Post and Ye (01). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 17/1

44 Convergence Lemma Under assumptions 1 and, suppose π i+1 is obtained from π i by performing the improving switch with most negative reduced cost. Then: ( val πi+1 (v) 1 1 ) n val πi (v) v V v V A corresponding lemma was shown by Orlin (1985) and by Post and Ye (01). Post and Ye (01) use the lemma to bound the number of iterations until a single edge is eliminated. We create a tradeoff: Either multiple edges are eliminated, or the convergence is faster. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 17/1

45 The benefit of few edges with large cost t 0 0 c 0 0 c A policy π is a tree rooted at the target t. If the cost of an edge (u, v) π is almost zero then valπ(u) valπ(v). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 18/1

46 Stronger lemma Lemma Under assumptions 1 and, suppose π i+1 is obtained from π i by performing the improving switch with most negative reduced cost. Assume that the vertices can be partitioned into k sets, such that all vertices in the same set have almost the same value. Then: ( val πi+1 (v) 1 1 ) val πi (v) kn v V v V We use this lemma to show that every O(kn log n) iterations, k edges are eliminated. Thus, the total number of iterations is at most O(mn log n). Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 19/1

47 Stronger lemma Lemma Under assumptions 1 and, suppose π i+1 is obtained from π i by performing the improving switch with most negative reduced cost. Assume that the vertices can be partitioned into k sets, such that all vertices in the same set have almost the same value. Then: ( val πi+1 (v) 1 1 ) val πi (v) kn v V v V We use this lemma to show that every O(kn log n) iterations, k edges are eliminated. Thus, the total number of iterations is at most O(mn log n). Note: The number of large-cost edges in the current policy varies. The analysis is restarted when this number doubles. Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 19/1

48 Open problems Close the gap between the O(mn log n) and Ω(n ) bounds for the number of pivots performed by Dantzig s rule for single source shortest paths. Improve the O(m n log n) and O(m n 4 log n) bounds for Dantzig s rule for deterministic MDPs with uniform and varying discounts, respectively. Prove a strongly polynomial bound for Howard s algorithm for deterministic MDPs? This algorithm simultaneously performs the improving switch with most negative reduced cost at every vertex. Hansen and Zwick (010) conjectured that the number of iterations should be at most m. Can the minimum cost to time ratio cycle problem be solved in time O(mn), improving the Õ(n ) algorithm of Megiddo (198)? Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 0/1

49 The end Thank you for listening! Hansen, Kaplan, and Zwick Dantzig s rule for SSSP, DMDPs, and min-ratio-cycles 1/1

A subexponential lower bound for the Random Facet algorithm for Parity Games

A subexponential lower bound for the Random Facet algorithm for Parity Games Oliver Friedmann 1 Thomas Dueholm Hansen 2 Uri Zwick 3 1 Department of Computer Science, University of Munich, Germany. 2 Center