Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs

Size: px

Start display at page:

Download "Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs"

Vincent Paul
5 years ago
Views:

1 Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs V. Bruyère (UMONS) E. Filiot (ULB) M. Randour (UMONS-ULB) J.-F. Raskin (ULB) Grenoble SR 04 - nd International Workshop on Strategic Reasoning

2 The talk in two slides (/) Verification and synthesis: a reactive system to control, an interacting environment, a specification to enforce. Focus on quantitative properties. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

3 The talk in two slides (/) Verification and synthesis: a reactive system to control, an interacting environment, a specification to enforce. Focus on quantitative properties. Several ways to look at the interactions, and in particular, the nature of the environment. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

4 The talk in two slides (/) Games antagonistic adversary guarantees on worst-case MDPs stochastic adversary optimize expected value Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

5 The talk in two slides (/) Games antagonistic adversary guarantees on worst-case MDPs stochastic adversary optimize expected value BWC synthesis ensure both Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

6 The talk in two slides (/) Games antagonistic adversary guarantees on worst-case MDPs stochastic adversary optimize expected value BWC synthesis ensure both Mean-Payoff Studied value functions Shortest Path Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

7 Advertisement Featured in STACS 4 [BFRR4] Full paper available on arxiv: abs/ Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

8 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 4 / 6

9 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

10 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

11 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

12 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

13 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

14 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

15 Quantitative games on graphs 4 7 Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

16 Quantitative games on graphs 7 4 Then, (,, ) ω Graph G = (S, E, w) with w : E Z Two-player game G = (G, S, S ) P states = P states = Plays have values f : Plays(G) R {, } Players follow strategies λ i : Prefs i (G) D(S) Finite memory stochastic output Moore machine M(λ i ) = (Mem, m 0, α u, α n ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

17 Markov decision processes 4 7 MDP P = (G, S, S, ) with : S D(S) P states = stochastic states = MDP = game + strategy of P P = G[λ ] Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 7 / 6

18 Markov chains MC M = (G, δ) with δ : S D(S) 4 MC = MDP + strategy of P = game + both strategies M = P[λ ] = G[λ, λ ] Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

19 Markov chains MC M = (G, δ) with δ : S D(S) MC = MDP + strategy of P = game + both strategies M = P[λ ] = G[λ, λ ] Event A Plays(G) probability P M s init (A) Measurable f : Plays(G) R {, } expected value E M s init (f ) Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

20 Classical interpretations System trying to ensure a specification = P whatever the actions of its environment Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

21 Classical interpretations System trying to ensure a specification = P whatever the actions of its environment The environment can be seen as antagonistic two-player game, worst-case threshold problem for µ Q? λ Λ, λ Λ, π Outs G (s init, λ, λ ), f (π) µ Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

22 Classical interpretations System trying to ensure a specification = P whatever the actions of its environment The environment can be seen as antagonistic two-player game, worst-case threshold problem for µ Q? λ Λ, λ Λ, π Outs G (s init, λ, λ ), f (π) µ fully stochastic MDP, expected value threshold problem for ν Q? λ Λ, E P[λ ] s init (f ) ν Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

23 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 0 / 6

24 What if you want both? In practice, we want both nice expected performance in the everyday situation, strict (but relaxed) performance guarantees even in the event of very bad circumstances. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

25 Example: going to work back home delay 0 train station 9 0 home departs 3 wait 4 light 0 0 car traffic 7 0 medium 30 0 heavy 70 bicycle 4 Weights = minutes Goal: minimize our expected time to reach work But, important meeting in one hour! Requires strict guarantees on the worst-case reaching time. waiting room work Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

26 Example: going to work back home delay 0 train station 9 0 home departs 3 wait 4 light 0 0 car traffic 7 0 medium 30 0 heavy 70 bicycle 4 Optimal expectation strategy: take the car. E = 33, WC = 7 > 60. Optimal worst-case strategy: bicycle. E = WC = 4 < 60. waiting room work Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

27 Example: going to work back home delay 0 train station waiting room 9 0 home departs 3 wait 4 light 0 0 car traffic 7 0 medium 30 work 0 heavy 70 bicycle 4 Optimal expectation strategy: take the car. E = 33, WC = 7 > 60. Optimal worst-case strategy: bicycle. E = WC = 4 < 60. Sample BWC strategy: try train up to 3 delays then switch to bicycle. E 37.6, WC = 9 < 60. Optimal E under WC constraint Uses finite memory Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

28 Beyond worst-case synthesis Formal definition Given a game G = (G, S, S ), with G = (S, E, w) its underlying graph, an initial state s init S, a finite-memory stochastic model λ stoch Λ F of the adversary, represented by a stochastic Moore machine, a measurable value function f : Plays(G) R {, }, and two rational thresholds µ, ν Q, the beyond worst-case (BWC) problem asks to decide if P has a finite-memory strategy λ Λ F such that { λ Λ, π Outs G (s init, λ, λ ), f (π) > µ () E G[λ,λ stoch ] s init (f ) > ν () and the BWC synthesis problem asks to synthesize such a strategy if one exists. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

29 Beyond worst-case synthesis Formal definition Given a game G = (G, S, S ), with G = (S, E, w) its underlying graph, an initial state s init S, a finite-memory stochastic model λ stoch Λ F of the adversary, represented by a stochastic Moore machine, a measurable value function f : Plays(G) R {, }, and two rational thresholds µ, ν Q, the beyond worst-case (BWC) problem asks to decide if P has a finite-memory strategy λ Λ F such that { λ Λ, π Outs G (s init, λ, λ ), f (π) > µ () E G[λ,λ stoch ] s init (f ) > ν () and the BWC synthesis problem asks to synthesize such a strategy if one exists. Notice the highlighted parts! Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

30 Related work Common philosophy: avoiding outlier outcomes Our strategies are strongly risk averse avoid risk at all costs and optimize among safe strategies Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 4 / 6

31 Related work Common philosophy: avoiding outlier outcomes Our strategies are strongly risk averse avoid risk at all costs and optimize among safe strategies Other notions of risk ensure low probability of risked behavior [WL99, FKR9] without worst-case guarantee without good expectation Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 4 / 6

32 Related work Common philosophy: avoiding outlier outcomes Our strategies are strongly risk averse avoid risk at all costs and optimize among safe strategies Other notions of risk ensure low probability of risked behavior [WL99, FKR9] without worst-case guarantee without good expectation 3 Trade-off between expectation and variance [BCFK3, MT] statistical measure of the stability of the performance no strict guarantee on individual outcomes Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 4 / 6

33 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

34 Mean-payoff value function MP(π) = lim inf n [ i=n n w ( (s i, s i+ ) )] i=0 Sample play π =,, 4,, (,, ) ω MP(π) = 3 long-run average weight prefix-independent Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

35 Mean-payoff value function MP(π) = lim inf n [ i=n n w ( (s i, s i+ ) )] i=0 Sample play π =,, 4,, (,, ) ω MP(π) = 3 long-run average weight prefix-independent worst-case expected value BWC complexity NP conp P NP conp memory memoryless memoryless pseudo-polynomial [LL69, EM79, ZP96, Jur98, GS09, Put94, FV97] Additional modeling power for free! Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

36 Philosophy of the algorithm Classical worst-case and expected value results and algorithms as nuts and bolts Screw them together in an adequate way Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 7 / 6

37 Philosophy of the algorithm Classical worst-case and expected value results and algorithms as nuts and bolts Screw them together in an adequate way Three key ideas To characterize the expected value, look at end-components (ECs) Winning ECs vs. losing ECs: the latter must be avoided to preserve the worst-case requirement! 3 Inside a WEC, we have an interesting way to play... Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 7 / 6

38 Philosophy of the algorithm Classical worst-case and expected value results and algorithms as nuts and bolts Screw them together in an adequate way Three key ideas To characterize the expected value, look at end-components (ECs) Winning ECs vs. losing ECs: the latter must be avoided to preserve the worst-case requirement! 3 Inside a WEC, we have an interesting way to play... = Let s focus on an ideal case Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 7 / 6

39 An ideal situation s s 6 s Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

40 An ideal situation s s 6 s Game interpretation Worst-case threshold is µ = 0 All states are winning: memoryless optimal worst-case strategy λ wc Λ PM (G), ensuring µ = > 0 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

41 An ideal situation s s 6 s Game interpretation Worst-case threshold is µ = 0 All states are winning: memoryless optimal worst-case strategy λ wc Λ PM (G), ensuring µ = > 0 MDP interpretation Memoryless optimal expected value strategy λ e ΛPM (P) achieves ν = Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

42 A cornerstone of our approach s s 6 s BWC problem: what kind of threholds (0, ν) can we achieve? Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

43 A cornerstone of our approach s s 6 s BWC problem: what kind of threholds (0, ν) can we achieve? Key result For all ε > 0, there exists a finite-memory strategy of P that satisfies the BWC problem for the thresholds pair (0, ν ε). We can be arbitrarily close to the optimal expectation while ensuring the worst-case! Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

44 Combined strategy s s 6 s Outcomes of the form > 0 > 0 0 compensate K steps L steps > 0 0 compensate WC > 0 E =?? Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

45 Combined strategy s s 6 s Outcomes of the form > 0 > 0 0 K steps L steps What we want compensate > 0 0 compensate K, L WC > 0 E =?? E = ν = Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

46 Combined strategy: crux of the proof Precise reasoning on convergence rates using involved techniques When K grows, L needs to grow linearly to ensure WC Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

47 Combined strategy: crux of the proof Precise reasoning on convergence rates using involved techniques When K grows, L needs to grow linearly to ensure WC When K grows, P( ) 0 and it decreases exponentially fast application of Chernoff bounds and Hoeffding s inequality for Markov chains [Tra09, GO0] Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

48 Combined strategy: crux of the proof Precise reasoning on convergence rates using involved techniques When K grows, L needs to grow linearly to ensure WC When K grows, P( ) 0 and it decreases exponentially fast application of Chernoff bounds and Hoeffding s inequality for Markov chains [Tra09, GO0] Overall we are good: WC > 0 and E > ν ε for sufficiently large K, L. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

49 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 0 / 6

50 Shortest path - truncated sum Assume strictly positive integer weights, w : E N 0 Let T S be a target set that P wants to reach with a path of bounded value (cf. introductory example) inequalities are reversed, ν < µ TS T (π = s 0 s s... ) = n i=0 w((s i, s i+ )), with n the first index such that s n T, and TS T (π) = if n, s n T Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

51 Shortest path - truncated sum Assume strictly positive integer weights, w : E N 0 Let T S be a target set that P wants to reach with a path of bounded value (cf. introductory example) inequalities are reversed, ν < µ TS T (π = s 0 s s... ) = n i=0 w((s i, s i+ )), with n the first index such that s n T, and TS T (π) = if n, s n T worst-case expected value BWC complexity P P pseudo-poly. / NP-hard memory memoryless memoryless pseudo-poly. [BT9, da99] Problem inherently harder than worst-case and expectation. NP-hardness by K th largest subset problem [JK78, GJ79] Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

52 Key difference with MP case Useful observation The set of all worst-case winning strategies for the shortest path can be represented through a finite game. Sequential approach solving the BWC problem: represent all WC winning strategies, optimize the expected value within those strategies. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

53 Pseudo-polynomial algorithm: sketch s s s 3 Start from G = (G, S, S ), G = (S, E, w), T = {s 3 }, M(λ stoch ), µ = 8, and ν Q Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

54 Pseudo-polynomial algorithm: sketch s s s 3 Start from G = (G, S, S ), G = (S, E, w), T = {s 3 }, M(λ stoch ), µ = 8, and ν Q Build G by unfolding G, tracking the current sum up to the worst-case threshold µ, and integrating it in the states of G. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

55 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s3, Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

56 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s3, s3, Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

57 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s, 3 s3, s3, s3, 7 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

58 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s, 3 s, 4 s3, s3, s3, 7 s3, 4 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

59 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s, 3 s, 4 s, s3, s3, s3, 7 s3, 4 s3, Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

60 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s, 3 s, 4 s, s, 6 s3, s3, s3, 7 s3, 4 s3, s3, 6 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

61 Pseudo-polynomial algorithm: sketch s s s 3 s, 0 s, s, s, 3 s, 4 s, s, 6 s, 7 s, s3, s3, s3, 7 s3, 4 s3, s3, 6 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

62 Pseudo-polynomial algorithm: sketch 3 Compute R, the attractor of T with cost < µ = 8 4 Consider G µ = G R s, 0 s, s, s, 3 s, 4 s, s, 6 s, 7 s, s3, s3, s3, 7 s3, 4 s3, s3, 6 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

63 Pseudo-polynomial algorithm: sketch 3 Compute R, the attractor of T with cost < µ = 8 4 Consider G µ = G R s, 0 s, s, s3, s3, s3, 7 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

64 Pseudo-polynomial algorithm: sketch Consider P = G µ M(λ stoch ) 6 Compute memoryless optimal expectation strategy 7 If ν < ν, answer Yes, otherwise answer No s, 0 s, s, Here, ν = 9/ s3, s3, s3, 7 Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 3 / 6

65 Context BWC Synthesis 3 Mean-Payoff 4 Shortest Path Conclusion Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 4 / 6

66 In a nutshell BWC framework combines worst-case and expected value requirements a natural wish in many practical applications few existing theoretical support Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

67 In a nutshell BWC framework combines worst-case and expected value requirements a natural wish in many practical applications few existing theoretical support Mean-payoff: additional modeling power for no complexity cost (decision-wise) Shortest path: harder than the worst-case, pseudo-polynomial with NP-hardness result Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

68 In a nutshell BWC framework combines worst-case and expected value requirements a natural wish in many practical applications few existing theoretical support Mean-payoff: additional modeling power for no complexity cost (decision-wise) Shortest path: harder than the worst-case, pseudo-polynomial with NP-hardness result In both cases, pseudo-polynomial memory is both sufficient and necessary but strategies have natural representations based on states of the game and simple integer counters Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin / 6

69 Beyond BWC synthesis? Possible future works include study of other quantitative objectives, extension of our results to more general settings (multi-dimension [CDHR0, CRR], decidable classes of games with imperfect information [DDG + 0], etc), application of the BWC problem to various practical cases. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

70 Beyond BWC synthesis? Possible future works include study of other quantitative objectives, extension of our results to more general settings (multi-dimension [CDHR0, CRR], decidable classes of games with imperfect information [DDG + 0], etc), application of the BWC problem to various practical cases. Thanks! Do not hesitate to discuss with us! Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 6 / 6

71 References I T. Brázdil, K. Chatterjee, V. Forejt, and A. Kucera. Trading performance for stability in Markov decision processes. In Proc. of LICS, pages IEEE Computer Society, 03. V. Bruyère, E. Filiot, M. Randour, and J.-F. Raskin. Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games. In Proc. of STACS, LIPIcs, pages Schloss Dagstuhl - LZI, 04. D.P. Bertsekas and J.N. Tsitsiklis. An analysis of stochastic shortest path problems. Mathematics of Operations Research, 6:80 9, 99. K. Chatterjee, L. Doyen, T.A. Henzinger, and J.-F. Raskin. Generalized mean-payoff and energy games. In Proc. of FSTTCS, LIPIcs 8, pages 0 6. Schloss Dagstuhl - LZI, 00. K. Chatterjee, L. Doyen, M. Randour, and J.-F. Raskin. Looking at mean-payoff and total-payoff through windows. In Proc. of ATVA, LNCS 87, pages 8 3. Springer, 03. K. Chatterjee, M. Randour, and J.-F. Raskin. Strategy synthesis for multi-dimensional quantitative objectives. In Proc. of CONCUR, LNCS 744, pages 3. Springer, 0. L. de Alfaro. Computing minimum and maximum reachability times in probabilistic systems. In Proc. of CONCUR, LNCS 664, pages Springer, 999. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 7 / 6

72 References II A. Degorre, L. Doyen, R. Gentilini, J.-F. Raskin, and S. Torunczyk. Energy and mean-payoff games with imperfect information. In Proc. of CSL, LNCS 647, pages Springer, 00. A. Ehrenfeucht and J. Mycielski. Positional strategies for mean payoff games. Int. Journal of Game Theory, 8():09 3, 979. J.A. Filar, D. Krass, and K.W. Ross. Percentile performance criteria for limiting average Markov decision processes. Transactions on Automatic Control, pages 0, 99. J. Filar and K. Vrieze. Competitive Markov decision processes. Springer, 997. M.R. Garey and D.S. Johnson. Computers and intractability: a guide to the Theory of NP-Completeness. Freeman New York, 979. P.W. Glynn and D. Ormoneit. Hoeffding s inequality for uniformly ergodic Markov chains. Statistics & Probability Letters, 6():43 46, 00. T. Gawlitza and H. Seidl. Games through nested fixpoints. In Proc. of CAV, LNCS 643, pages Springer, 009. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 8 / 6

73 References III D.B. Johnson and S.D. Kashdan. Lower bounds for selection in X + Y and other multisets. Journal of the ACM, (4):6 70, 978. M. Jurdziński. Deciding the winner in parity games is in UP co-up. Inf. Process. Lett., 68(3):9 4, 998. T.M. Liggett and S.A. Lippman. Stochastic games with perfect information and time average payoff. Siam Review, (4): , 969. S. Mannor and J.N. Tsitsiklis. Mean-variance optimization in Markov decision processes. In Proc. of ICML, pages Omnipress, 0. M.L. Puterman. Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons, Inc., New York, NY, USA, st edition, 994. M. Tracol. Fast convergence to state-action frequency polytopes for MDPs. Oper. Res. Lett., 37():3 6, 009. C. Wu and Y. Lin. Minimizing risk models in Markov decision processes with policies depending on target values. Journal of Mathematical Analysis and Applications, 3():47 67, 999. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 9 / 6

74 References IV U. Zwick and M. Paterson. The complexity of mean payoff games on graphs. Theoretical Computer Science, 8:343 39, 996. Beyond Worst-Case Synthesis Bruyère, Filiot, Randour, Raskin 30 / 6

Looking at Mean-Payoff and Total-Payoff through Windows

Looking at Mean-Payoff and Total-Payoff through Windows Krishnendu Chatterjee 1,, Laurent Doyen 2, Mickael Randour 3,, and Jean-François Raskin 4, 1 IST Austria (Institute of Science and Technology Austria)