

Nash Equilibrium for Upward-Closed Objectives

Krishnendu Chatterjee

Report No. UCB/CSD, August 2005

Computer Science Division (EECS), University of California, Berkeley, California 94720

Nash Equilibrium for Upward-Closed Objectives*

Krishnendu Chatterjee
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley, USA
krish@cs.berkeley.edu

August 2005

Abstract. We study infinite stochastic games played by n players on a finite graph, with goals specified by sets of infinite traces. The games are concurrent (each player simultaneously and independently chooses an action at each round), stochastic (the next state is determined by a probability distribution depending on the current state and the chosen actions), infinite (the game continues for an infinite number of rounds), nonzero-sum (the players' goals are not necessarily conflicting), and undiscounted. We show that if each player has an upward-closed objective, then there exists an ε-Nash equilibrium in memoryless strategies for every ε > 0; exact Nash equilibria need not exist. Upward-closure of an objective means that if a set Z of infinitely repeating states is winning, then all supersets of Z of infinitely repeating states are also winning. Memoryless strategies are strategies that are independent of the history of play and depend only on the current state. We also study the complexity of finding the values (payoff profile) of an ε-Nash equilibrium. We show that the values of an ε-Nash equilibrium in nonzero-sum concurrent games with upward-closed objectives for all players can be computed by computing ε-Nash equilibrium values of nonzero-sum concurrent games with reachability objectives for all players, followed by a polynomial-time procedure. As a consequence we establish that the values of an ε-Nash equilibrium can be computed in TFNP (total functional NP), and hence in EXPTIME.

* This research was supported in part by the ONR grant N , the AFOSR MURI grant F , and the NSF grant CCR .

1 Introduction

Stochastic games. Non-cooperative games provide a natural framework to model interactions between agents [16, 18]. The simplest class of non-cooperative games consists of the "one-step" games: games with a single interaction between the agents, after which the game ends and the payoffs are decided (e.g., matrix games). However, a wide class of games progress over time and in a stateful manner, and the current game depends on the history of interactions. Infinite stochastic games [20, 9] are a natural model for such games. A stochastic game is played over a finite state space and proceeds in rounds. In concurrent games, in each round each player chooses an action from a finite set of available actions, simultaneously and independently of the other players. The game proceeds to a new state according to a probabilistic transition relation (stochastic transition matrix) based on the current state and the joint actions of the players. Concurrent games subsume the simpler class of turn-based games, where at every state at most one player can choose between multiple actions. In verification and control of finite-state reactive systems such games proceed for infinitely many rounds, generating an infinite sequence of states, called the outcome of the game. The players receive a payoff based on a payoff function that maps every outcome to a real number.

Objectives. Payoffs are generally Borel measurable functions [15]. The payoff set for each player is a Borel set B_i in the Cantor topology on S^ω (where S is the set of states); player i gets payoff 1 if the outcome of the game is in B_i, and 0 otherwise. In verification, payoff functions are usually index sets of ω-regular languages. The ω-regular languages generalize the classical regular languages to infinite strings; they occur in the low levels of the Borel hierarchy (they are in Σ₃ ∩ Π₃), and they form a robust and expressive language for determining payoffs for commonly used specifications.
The simplest ω-regular objectives correspond to safety ("closed sets") and reachability ("open sets") objectives.

Zero-sum games. Games may be zero-sum, where two players have directly conflicting objectives and the payoff of one player is one minus the payoff of the other, or nonzero-sum, where each player has a prescribed payoff function based on the outcome of the game. The fundamental question for games is the existence of equilibrium values. For zero-sum games, this involves showing a determinacy theorem, which states that the expected optimum value obtained by player 1 is exactly one minus the expected optimum value obtained by player 2. For one-step zero-sum games, this is von

Neumann's minimax theorem [25]. For infinite games, the existence of such equilibria is not obvious; in fact, using the axiom of choice, one can construct games for which determinacy does not hold. However, a remarkable result by Martin [15] shows that all stochastic zero-sum games with Borel payoffs are determined.

Nonzero-sum games. For nonzero-sum games, the fundamental equilibrium concept is a Nash equilibrium [11], that is, a strategy profile such that no player can gain by deviating from the profile, assuming the other players continue playing the strategies in the profile. Again, for one-step games, the existence of such equilibria is guaranteed by Nash's theorem [11]. However, the existence of Nash equilibria in infinite games is not immediate: Nash's theorem holds for finite bimatrix games, but in the case of stochastic games the strategy space is not compact. The existence of Nash equilibria is known only in very special cases of stochastic games. In fact, Nash equilibria may not exist, and the best one can hope for is an ε-Nash equilibrium for all ε > 0, where an ε-Nash equilibrium is a strategy profile in which unilateral deviation can increase the payoff of a player by at most ε. Exact Nash equilibria do exist in discounted stochastic games [10]. For concurrent nonzero-sum games with payoffs defined by Borel sets, surprisingly little is known. Secchi and Sudderth [19] showed that exact Nash equilibria exist when all players have payoffs defined by closed sets ("safety objectives", or Π₁ objectives). In the case of open sets ("reachability objectives", or Σ₁ objectives), the existence of an ε-Nash equilibrium for every ε > 0 has been established in [4]. For the special case of two-player games, the existence of an ε-Nash equilibrium, for every ε > 0, is known for ω-regular objectives [2] and limit-average objectives [23, 24].
The existence of ε-Nash equilibria in n-player concurrent games with objectives in higher levels of the Borel hierarchy than Σ₁ and Π₁ has been an intriguing open problem; the existence of an ε-Nash equilibrium is not known even when each player has a Büchi objective.

Result and proof techniques. In this paper we show that an ε-Nash equilibrium exists, for every ε > 0, in n-player concurrent games with upward-closed objectives. However, exact Nash equilibria need not exist. Informally, an objective Ψ is upward-closed if, whenever a play ω that visits a set Z of states infinitely often is in Ψ, every play ω′ that visits a superset Z′ ⊇ Z of states infinitely often is also in Ψ. The class of upward-closed objectives subsumes Büchi and generalized Büchi objectives as special cases. For n-player concurrent games our result extends the existence of ε-Nash equilibria from the lowest level of the Borel hierarchy (open and closed sets) to a class of objectives that lie in higher levels of the Borel hierarchy (upward-closed objectives lie

in Π₂) and that subsumes several interesting classes of objectives. Along with existence, our result gives a finer characterization of ε-Nash equilibria, showing the existence of ε-Nash equilibria in memoryless strategies (strategies that are independent of the history of the play and depend only on the current state). The proof is organized as follows:

1. In Section 3 we develop some results on one-player versions of concurrent games and on n-player concurrent games with reachability objectives.

2. In Section 4 we use induction on the number of players, the results of Section 3, and an analysis of Markov chains to establish the desired result.

Complexity of ε-Nash equilibrium. Computing the values of a Nash equilibrium, when one exists, is another challenging problem [17, 26]. For one-step zero-sum games, equilibrium values and strategies can be computed in polynomial time (by reduction to linear programming) [16]. For one-step nonzero-sum games, no polynomial-time algorithm is known for computing an exact Nash equilibrium, even in two-player games [17]. From the computational point of view, a desirable property of an existence proof of Nash equilibrium is its ease of algorithmic analysis. Our proof of the existence of ε-Nash equilibria is completely constructive and algorithmic. It shows that an ε-Nash equilibrium in n-player concurrent games with upward-closed objectives can be computed by computing ε-Nash equilibria of games with reachability objectives together with a polynomial-time procedure. Thus computing ε-Nash equilibria for upward-closed objectives is no harder, up to a polynomial factor, than computing ε-Nash equilibria of n-player games with reachability objectives. We then prove that an ε-Nash equilibrium can be computed in TFNP (total functional NP), and hence in EXPTIME.

2 Definitions

Notation. For a countable set A, a probability distribution on A is a function δ : A → [0, 1] such that Σ_{a∈A} δ(a) = 1.
We denote the set of probability distributions on A by D(A). Given a distribution δ ∈ D(A), we denote by Supp(δ) = {x ∈ A | δ(x) > 0} the support of δ.

Definition 1 (Concurrent game structures) An n-player concurrent game structure G = ⟨S, A, Γ₁, Γ₂, …, Γₙ, δ⟩ consists of the following components:

• A finite state space S and a finite set A of moves.

• Move assignments Γ₁, Γ₂, …, Γₙ : S → 2^A \ ∅. For i ∈ {1, 2, …, n}, the move assignment Γ_i associates with each state s ∈ S the non-empty set Γ_i(s) ⊆ A of moves available to player i at state s.

• A probabilistic transition function δ : S × A × A × ⋯ × A → D(S), which gives the probability δ(s, a₁, a₂, …, aₙ)(t) of a transition from s to t when player i plays move a_i, for all s, t ∈ S and a_i ∈ Γ_i(s), for i ∈ {1, 2, …, n}.

We define the size of the game structure G to be equal to the size of the transition function δ; specifically,

  |G| = Σ_{s∈S} Σ_{(a₁,a₂,…,aₙ)∈Γ₁(s)×Γ₂(s)×⋯×Γₙ(s)} Σ_{t∈S} |δ(s, a₁, a₂, …, aₙ)(t)|,

where |δ(s, a₁, a₂, …, aₙ)(t)| denotes the space needed to specify the probability. At every state s ∈ S, each player i chooses a move a_i ∈ Γ_i(s), simultaneously and independently of the other players, and the game then proceeds to the successor state t with probability δ(s, a₁, a₂, …, aₙ)(t), for all t ∈ S. A state s is called an absorbing state if for all a_i ∈ Γ_i(s) we have δ(s, a₁, a₂, …, aₙ)(s) = 1. In other words, at s, for all choices of moves of the players, the next state is always s. For all states s ∈ S and moves a_i ∈ Γ_i(s) we denote by Dest(s, a₁, a₂, …, aₙ) = Supp(δ(s, a₁, a₂, …, aₙ)) the set of possible successors of s when the moves a₁, a₂, …, aₙ are selected. A path or a play ω of G is an infinite sequence ω = ⟨s₀, s₁, s₂, …⟩ of states in S such that for all k ≥ 0 there are moves a_i^k ∈ Γ_i(s_k) with δ(s_k, a₁^k, a₂^k, …, aₙ^k)(s_{k+1}) > 0. We denote by Ω the set of all paths and by Ω_s the set of all paths ω = ⟨s₀, s₁, s₂, …⟩ such that s₀ = s, i.e., the set of plays starting from state s.

Randomized strategies. A selector ξ_i for player i ∈ {1, 2, …, n} is a function ξ_i : S → D(A) such that for all s ∈ S and a ∈ A, if ξ_i(s)(a) > 0 then a ∈ Γ_i(s). We denote by Λ_i the set of all selectors for player i ∈ {1, 2, …, n}.
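Definition 1 can be rendered as a small data structure. Below is a minimal Python sketch (all names are ours, not the report's): a concurrent game structure with a validity check that every δ(s, a₁, …, aₙ) is a probability distribution over S. The example instance is a hypothetical encoding of the game of Example 1(a) later in the report (absorbing states are given singleton move sets).

```python
from itertools import product

class ConcurrentGameStructure:
    """Minimal n-player concurrent game structure <S, A, Gamma_1..Gamma_n, delta>."""

    def __init__(self, states, moves, gamma, delta):
        # states: the set S; moves: the set A
        # gamma[i][s]: non-empty list of moves available to player i at s
        # delta[(s, joint_moves)]: dict mapping successor t -> probability
        self.states, self.moves = states, moves
        self.gamma, self.delta = gamma, delta
        self.n = len(gamma)
        self._check()

    def _check(self):
        # every delta(s, a_1, ..., a_n) must be a probability distribution on S
        for s in self.states:
            for joint in product(*(self.gamma[i][s] for i in range(self.n))):
                dist = self.delta[(s, joint)]
                assert abs(sum(dist.values()) - 1.0) < 1e-9
                assert all(t in self.states for t in dist)

    def dest(self, s, joint):
        # Dest(s, a_1, ..., a_n) = Supp(delta(s, a_1, ..., a_n))
        return {t for t, p in self.delta[(s, joint)].items() if p > 0}

# Hypothetical encoding of Example 1(a): player 1 has moves {a, b} at s0,
# player 2 has {c, d}; s1 and s2 are absorbing.
G = ConcurrentGameStructure(
    states={'s0', 's1', 's2'},
    moves={'a', 'b', 'c', 'd'},
    gamma=[
        {'s0': ['a', 'b'], 's1': ['a'], 's2': ['a']},
        {'s0': ['c', 'd'], 's1': ['c'], 's2': ['c']},
    ],
    delta={
        ('s0', ('a', 'c')): {'s0': 1.0}, ('s0', ('a', 'd')): {'s1': 1.0},
        ('s0', ('b', 'c')): {'s1': 1.0}, ('s0', ('b', 'd')): {'s2': 1.0},
        ('s1', ('a', 'c')): {'s1': 1.0}, ('s2', ('a', 'c')): {'s2': 1.0},
    },
)
```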
A strategy σ_i for player i is a function σ_i : S⁺ → Λ_i that associates with every finite non-empty sequence of states, representing the history of the play so far, a selector. A memoryless strategy is independent of the history of the play and depends only on the current state. Memoryless strategies coincide with selectors, and we often write σ_i for the selector corresponding to a memoryless strategy σ_i. A memoryless strategy σ_i for player i is uniform memoryless if the selector of the memoryless strategy is a uniform

distribution over its support, i.e., for all states s we have σ_i(s)(a_i) = 0 if a_i ∉ Supp(σ_i(s)), and σ_i(s)(a_i) = 1/|Supp(σ_i(s))| if a_i ∈ Supp(σ_i(s)). We denote by Σ_i, Σ_i^M and Σ_i^UM the set of all strategies, the set of all memoryless strategies and the set of all uniform memoryless strategies for player i, respectively. Given strategies σ_i for the players, we denote by σ the strategy profile ⟨σ₁, σ₂, …, σₙ⟩. A strategy profile σ is memoryless (resp. uniform memoryless) if all the component strategies are memoryless (resp. uniform memoryless). Given a strategy profile σ = (σ₁, σ₂, …, σₙ) and a state s, we denote by

  Outcome(s, σ) = {ω = ⟨s₀, s₁, s₂, …⟩ | s₀ = s and for all k ≥ 0 there exist moves a_i^k with σ_i(⟨s₀, s₁, …, s_k⟩)(a_i^k) > 0 and δ(s_k, a₁^k, a₂^k, …, aₙ^k)(s_{k+1}) > 0}

the set of all possible plays from s, given σ. Once the starting state s and the strategies σ_i of the players have been chosen, the game is reduced to an ordinary stochastic process. Hence the probabilities of events are uniquely defined, where an event A ⊆ Ω_s is a measurable set of paths. For an event A ⊆ Ω_s, we denote by Pr_s^σ(A) the probability that a path belongs to A when the game starts from s and the players follow the strategies σ_i, with σ = ⟨σ₁, σ₂, …, σₙ⟩.

Objectives. Objectives for the players in nonterminating games are specified by providing the set of winning plays Ψ ⊆ Ω for each player. A general class of objectives are the Borel objectives [13]. A Borel objective Φ ⊆ S^ω is a Borel set in the Cantor topology on S^ω. The class of ω-regular objectives [21] lies in the first 2½ levels of the Borel hierarchy (i.e., in the intersection of Σ₃ and Π₃). The ω-regular objectives, and subclasses thereof, can be specified in the following forms. For a play ω = ⟨s₀, s₁, s₂, …⟩ ∈ Ω, we define Inf(ω) = {s ∈ S | s_k = s for infinitely many k ≥ 0} to be the set of states that occur infinitely often in ω.

1. Reachability and safety objectives.
Given a game graph G and a set T ⊆ S of target states, the reachability specification Reach(T) requires that some state in T be visited. The reachability specification Reach(T) defines the objective [[Reach(T)]] = {⟨s₀, s₁, s₂, …⟩ ∈ Ω | ∃k ≥ 0. s_k ∈ T} of winning plays. Given a set F ⊆ S of safe states, the safety specification Safe(F) requires that only states in F be visited. The safety specification Safe(F) defines the objective [[Safe(F)]] = {⟨s₀, s₁, …⟩ ∈ Ω | ∀k ≥ 0. s_k ∈ F} of winning plays.

2. Büchi and generalized Büchi objectives. Given a game graph G and a set B ⊆ S of Büchi states, the Büchi specification Büchi(B) requires that states in B be visited infinitely often. The Büchi specification

Büchi(B) defines the objective [[Büchi(B)]] = {ω ∈ Ω | Inf(ω) ∩ B ≠ ∅} of winning plays. Let B₁, B₂, …, Bₙ be subsets of states, i.e., each B_i ⊆ S. The generalized Büchi specification requires that every Büchi specification Büchi(B_i) be satisfied. Formally, the generalized Büchi objective is ∩_{i∈{1,2,…,n}} [[Büchi(B_i)]].

3. Müller and upward-closed objectives. Given a set M ⊆ 2^S of Müller sets of states, the Müller specification Müller(M) requires that the set of states visited infinitely often in a play is exactly one of the sets in M. The Müller specification Müller(M) defines the objective [[Müller(M)]] = {ω ∈ Ω | Inf(ω) ∈ M} of winning plays. The upward-closed objectives form a subclass of Müller objectives, with the restriction that the set M is upward-closed. Formally, a set UC ⊆ 2^S is upward-closed if the following condition holds: if U ∈ UC and U ⊆ Z, then Z ∈ UC. Given an upward-closed set UC ⊆ 2^S, the upward-closed objective is defined as the set [[UpClo(UC)]] = {ω ∈ Ω | Inf(ω) ∈ UC} of winning plays. Observe that an upward-closed objective specifies that if a play ω that visits a set U of states infinitely often is winning, then a play ω′ that visits a superset of U infinitely often is also winning. The upward-closed objectives subsume Büchi and generalized Büchi (i.e., conjunctions of Büchi) objectives. The upward-closed objectives also subsume disjunctions of Büchi objectives. Since Büchi objectives lie in the second level of the Borel hierarchy (in Π₂), it follows that upward-closed objectives can express objectives that lie in Π₂. Müller objectives are a canonical form for ω-regular objectives; the class of upward-closed objectives forms a strict subset of Müller objectives and cannot express all ω-regular properties. We write Ψ for an arbitrary objective, and Ψ_i for the objective of player i. Given a Müller objective Ψ, the set of paths Ψ is measurable for any choice of strategies for the players [22].
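For a small state space, the upward-closure condition and the UpClo winning condition can be checked mechanically by enumeration. A sketch (helper names are ours), representing a family UC ⊆ 2^S as a set of frozensets, and rebuilding [[Büchi(B)]] as the upward-closed family of all sets meeting B:

```python
from itertools import chain, combinations

def is_upward_closed(uc, states):
    """Check the condition: U in UC and U subset of Z implies Z in UC."""
    def supersets(u):
        rest = sorted(states - u)
        for r in range(len(rest) + 1):
            for extra in combinations(rest, r):
                yield u | frozenset(extra)
    return all(z in uc for u in uc for z in supersets(u))

def upclo_wins(inf_states, uc):
    """A play omega satisfies [[UpClo(UC)]] iff Inf(omega) is in UC."""
    return frozenset(inf_states) in uc

def buechi_as_upclo(b, states):
    """[[Buechi(B)]] as an upward-closed family: all subsets meeting B."""
    all_subsets = chain.from_iterable(
        combinations(sorted(states), r) for r in range(len(states) + 1)
    )
    return {frozenset(u) for u in all_subsets if set(u) & set(b)}
```

For example, the family {{s0}} alone is not upward-closed, because its superset {s0, s1} is missing.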
Hence the probability that a path satisfies a Müller objective Ψ starting from state s ∈ S under a strategy profile σ is Pr_s^σ(Ψ).

Notations. Given a strategy profile σ = (σ₁, σ₂, …, σₙ), we denote by σ_{−i} = (σ₁, σ₂, …, σ_{i−1}, σ_{i+1}, …, σₙ) the strategy profile with the strategy of player i removed. Given a strategy σ′_i ∈ Σ_i and a strategy profile σ_{−i}, we denote by σ_{−i} ∪ σ′_i the strategy profile (σ₁, σ₂, …, σ_{i−1}, σ′_i, σ_{i+1}, …, σₙ). We also use the following notations: Σ = Σ₁ × Σ₂ × ⋯ × Σₙ; Σ^M = Σ₁^M × Σ₂^M × ⋯ × Σₙ^M; Σ^UM = Σ₁^UM × Σ₂^UM × ⋯ × Σₙ^UM; and Σ_{−i} = Σ₁ × Σ₂ ×

⋯ × Σ_{i−1} × Σ_{i+1} × ⋯ × Σₙ. The notations for Σ^M_{−i} and Σ^UM_{−i} are similar. For n ∈ ℕ, we denote by [n] the set {1, 2, …, n}.

Concurrent nonzero-sum games. A concurrent nonzero-sum game consists of a concurrent game structure G with an objective Ψ_i for each player i. The zero-sum values for the players in concurrent games with objective Ψ_i for player i are defined as follows.

Definition 2 (Zero-sum values) Let G be a concurrent game structure with objective Ψ_i for player i. Given a state s ∈ S, the maximal probability with which player i can ensure that Ψ_i holds from s, against all strategies of the other players, is the zero-sum value of player i at s. Formally, the zero-sum value for player i is given by the function val_i^G(Ψ_i) : S → [0, 1], defined for all s ∈ S by

  val_i^G(Ψ_i)(s) = sup_{σ_i ∈ Σ_i} inf_{σ_{−i} ∈ Σ_{−i}} Pr_s^{σ_{−i} ∪ σ_i}(Ψ_i).

A two-player concurrent game structure G with objectives Ψ₁ and Ψ₂ for player 1 and player 2, respectively, is zero-sum if the objectives of the players are complementary, i.e., Ψ₁ = Ω \ Ψ₂. Concurrent zero-sum games satisfy a quantitative version of determinacy [15], stating that for all two-player concurrent games with Müller objectives Ψ₁ and Ψ₂ such that Ψ₁ = Ω \ Ψ₂, and all s ∈ S, we have

  val₁^G(Ψ₁)(s) + val₂^G(Ψ₂)(s) = 1.

The determinacy also establishes the existence of ε-Nash equilibria, for all ε > 0, in concurrent zero-sum games.

Definition 3 (ε-Nash equilibrium) Let G be a concurrent game structure with objective Ψ_i for player i. For ε ≥ 0, a strategy profile σ* = (σ*₁, …, σ*ₙ) ∈ Σ is an ε-Nash equilibrium for a state s ∈ S iff the following condition holds for all i ∈ [n]:

  sup_{σ_i ∈ Σ_i} Pr_s^{σ*_{−i} ∪ σ_i}(Ψ_i) ≤ Pr_s^{σ*}(Ψ_i) + ε.

A Nash equilibrium is an ε-Nash equilibrium with ε = 0.

Example 1 (ε-Nash equilibrium) Consider the two-player game structure shown in Fig. 1(a). The states s₁ and s₂ are absorbing states, and the

sets of available moves for player 1 and player 2 at s₀ are {a, b} and {c, d}, respectively. The transition function is defined as follows:

  δ(s₀, a, c)(s₀) = 1;  δ(s₀, b, d)(s₂) = 1;  δ(s₀, a, d)(s₁) = δ(s₀, b, c)(s₁) = 1.

The objective of player 1 is the upward-closed objective [[UpClo(UC₁)]] with UC₁ = {{s₁}, {s₁, s₂}, {s₀, s₁}, {s₀, s₁, s₂}}, i.e., the objective of player 1 is to visit s₁ infinitely often (i.e., [[Büchi({s₁})]]). Since s₁ is an absorbing state in the game shown, the objective of player 1 is equivalent to [[Reach({s₁})]]. The objective of player 2 is the upward-closed objective [[UpClo(UC₂)]] with UC₂ = {{s₂}, {s₁, s₂}, {s₀, s₂}, {s₀, s₁, s₂}, {s₀}, {s₀, s₁}}. Observe that any play ω with Inf(ω) ≠ {s₁} is winning for player 2; hence the objectives of player 1 and player 2 are complementary. For ε > 0, consider the memoryless strategy σ₁^ε ∈ Σ₁^M that plays move a with probability 1 − ε and move b with probability ε. The game starts at s₀, and in each round, if player 2 plays move c, then the play reaches s₁ with probability ε and stays in s₀ with probability 1 − ε; whereas if player 2 plays move d, then the game reaches state s₁ with probability 1 − ε and state s₂ with probability ε. Hence it is easy to argue that against all strategies σ₂ of player 2, given the strategy σ₁^ε of player 1, the game reaches s₁ with probability at least 1 − ε. Hence for all ε > 0 there exists a strategy σ₁^ε for player 1 such that against all strategies σ₂ we have Pr_{s₀}^{σ₁^ε, σ₂}([[Reach({s₁})]]) ≥ 1 − ε; hence (1 − ε, ε) is an ε-Nash equilibrium value profile at s₀. However, we argue that (1, 0) is not a Nash equilibrium at s₀.
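The guarantee claimed for σ₁^ε in Fig. 1(a) can be checked by a short calculation against player 2's two pure stationary responses (a sketch with our own function names; the report's claim covers all strategies, not just these two):

```python
def reach_s1_vs_always_c(eps, rounds=100000):
    # Each round at s0, player 2 plays c: reach s1 with prob eps, stay otherwise.
    # The probability of never reaching s1 in `rounds` rounds is (1 - eps)^rounds,
    # which tends to 0, so the reach probability tends to 1.
    return 1 - (1 - eps) ** rounds

def reach_s1_vs_always_d(eps):
    # Player 2 plays d in the first round: s1 with prob 1 - eps, s2 with prob eps.
    return 1 - eps

eps = 0.01
v_c = reach_s1_vs_always_c(eps)   # essentially 1
v_d = reach_s1_vs_always_d(eps)   # exactly 1 - eps
worst = min(v_c, v_d)             # player 1 secures at least 1 - eps
```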
To prove the claim, given a strategy σ₁ for player 1, consider the following counter-strategy σ₂ for player 2: for k ≥ 0, at round k, if player 1 plays move a with probability 1, then player 2 chooses move c and ensures that state s₁ is reached with probability 0; otherwise, if player 1 plays move b with positive probability at round k, then player 2 chooses move d, and the play reaches s₂ with positive probability. That is, either s₂ is reached with positive probability or s₀ is visited infinitely often. Hence player 1 cannot satisfy [[UpClo(UC₁)]] with probability 1. This shows that in game structures with upward-closed objectives Nash equilibria need not exist, and an ε-Nash equilibrium, for all ε > 0, is the best one can achieve.

Consider the game shown in Fig. 1(b). The transition function at state s₀ is the same as in Fig. 1(a). The state s₂ is an absorbing state, and from state s₁ the next state is always s₀. The objective of player 1 is the same as in the previous example, i.e., [[Büchi({s₁})]]. Consider any upward-closed objective [[UpClo(UC₂)]] for player 2. We claim that the following strategy profile (σ₁, σ₂) is a Nash equilibrium at s₀: σ₁ is the memoryless strategy that plays a

Figure 1: Examples of Nash equilibria in two-player concurrent game structures.

with probability 1, and σ₂ is the memoryless strategy that plays c and d with probability 1/2 each. Given σ₁ and σ₂, the states s₀ and s₁ are visited infinitely often with probability 1. If {s₀, s₁} ∈ UC₂, then the objectives of both players are satisfied with probability 1. If {s₀, s₁} ∉ UC₂, then no subset of {s₀, s₁} is in UC₂. Given the strategy σ₁, for all strategies σ′₂ of player 2, the plays under σ₁ and σ′₂ visit subsets of {s₀, s₁} infinitely often. Hence, if {s₀, s₁} ∉ UC₂, then given σ₁, for all strategies of player 2 the objective [[UpClo(UC₂)]] is satisfied with probability 0, and player 2 has no incentive to deviate from σ₂. The claim follows. Note that the present example can be contrasted with the zero-sum game on the same game structure with objective [[Büchi({s₁})]] for player 1 and the complementary objective for player 2 (which is not upward-closed). In the zero-sum case, ε-optimal strategies for player 1 require infinite memory (see [7]). In the case of the nonzero-sum game with upward-closed objectives (which do not generalize the zero-sum case) we exhibited the existence of a memoryless Nash equilibrium.

3 Markov Decision Processes and Nash Equilibrium for Reachability Objectives

This section is divided into two parts: Subsection 3.1 develops facts about one-player concurrent game structures, and Subsection 3.2 develops facts about n-player concurrent game structures with reachability objectives. The facts developed in this section play a key role in the analysis of the later sections.

3.1 Markov decision processes

In this section we develop some facts about one-player versions of concurrent game structures, known as Markov decision processes (MDPs) [1]. For i ∈ [n], a player-i MDP is a concurrent game structure where for all s ∈ S and all j ∈ [n] \ {i} we have |Γ_j(s)| = 1, i.e., at every state only player i can choose between multiple moves, and the choices of the other players are singletons. If for all states s ∈ S and all i ∈ [n] we have |Γ_i(s)| = 1, then we have a Markov chain. Given a concurrent game structure G, if we fix a memoryless strategy profile σ_{−i} = (σ₁, …, σ_{i−1}, σ_{i+1}, …, σₙ) for the players in [n] \ {i}, the game structure is equivalent to a player-i MDP G_{σ_{−i}} with transition function

  δ_{σ_{−i}}(s, a_i)(t) = Σ_{(a₁,…,a_{i−1},a_{i+1},…,aₙ)} δ(s, a₁, a₂, …, aₙ)(t) · Π_{j∈[n]\{i}} σ_j(s)(a_j),

for all s ∈ S and a_i ∈ Γ_i(s). Similarly, if we fix a memoryless strategy profile σ ∈ Σ^M for a concurrent game structure G, we obtain a Markov chain, which we denote by G_σ. In an MDP, the sets of states that play a role analogous to that of the closed recurrent sets of states in Markov chains [14] are called end components [5, 6]. Without loss of generality we consider player-1 MDPs, and since the set Σ_{−1} is a singleton in player-1 MDPs, we only consider strategies for player 1.

Definition 4 (End components and maximal end components) Given a player-1 MDP G, an end component (EC) in G is a subset C ⊆ S such that there is a memoryless strategy σ₁ ∈ Σ₁^M for player 1 under which C forms a closed recurrent set in the resulting Markov chain, i.e., in the Markov chain G_{σ₁}. An end component C is a maximal end component if the following condition holds: if C ⊆ Z and Z is an end component, then C = Z, i.e., there is no end component that strictly encloses C.
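The sum-product construction of the induced player-i MDP above is direct to implement. A sketch (our naming and encoding, not the report's): joint transitions are averaged over the fixed memoryless strategies of the other players. The example fixes player 2's half-half strategy in a structure like Fig. 1(a).

```python
from itertools import product

def induced_mdp(delta, gamma, sigma, i, states):
    # delta[(s, joint)] = {t: prob}; gamma[j][s] = moves of player j at s
    # sigma[j][s] = {move: prob}: memoryless strategy of each player j != i
    n = len(gamma)
    others = [j for j in range(n) if j != i]
    result = {}
    for s in states:
        for a_i in gamma[i][s]:
            dist = {}
            # sum over joint moves of the other players, weighted by sigma
            for choice in product(*(gamma[j][s] for j in others)):
                joint = list(choice)
                joint.insert(i, a_i)  # place player i's move at position i
                weight = 1.0
                for j, a_j in zip(others, choice):
                    weight *= sigma[j][s].get(a_j, 0.0)
                for t, p in delta[(s, tuple(joint))].items():
                    dist[t] = dist.get(t, 0.0) + weight * p
            result[(s, a_i)] = dist
    return result

# Hypothetical two-player example: player 0 has {a, b} at s0, player 1 has {c, d}.
states = ['s0', 's1', 's2']
gamma = [
    {'s0': ['a', 'b'], 's1': ['a'], 's2': ['a']},
    {'s0': ['c', 'd'], 's1': ['c'], 's2': ['c']},
]
delta = {
    ('s0', ('a', 'c')): {'s0': 1.0}, ('s0', ('a', 'd')): {'s1': 1.0},
    ('s0', ('b', 'c')): {'s1': 1.0}, ('s0', ('b', 'd')): {'s2': 1.0},
    ('s1', ('a', 'c')): {'s1': 1.0}, ('s2', ('a', 'c')): {'s2': 1.0},
}
# Player 1 plays c and d with probability 1/2 at s0:
sigma = {1: {'s0': {'c': 0.5, 'd': 0.5}, 's1': {'c': 1.0}, 's2': {'c': 1.0}}}
mdp = induced_mdp(delta, gamma, sigma, 0, states)
```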
Graph of an MDP. Given a player-1 MDP G, the graph of G is the directed graph (S, E) with the set E of edges defined as E = {(s, t) | s, t ∈ S. ∃a ∈ Γ₁(s). t ∈ Dest(s, a)}; E(s) = {t | (s, t) ∈ E} denotes the set of possible successors of the state s in the MDP G.

Equivalent characterization. An equivalent characterization of an end component C is as follows: for each s ∈ C, there is a subset of moves M₁(s) ⊆ Γ₁(s) such that:

1. when a move in M₁(s) is chosen at s, all states that can be reached with non-zero probability are in C, i.e., for all s ∈ C and all a ∈ M₁(s), Dest(s, a) ⊆ C;

2. the graph (C, E) is strongly connected, where E consists of the transitions that occur with non-zero probability when moves in M₁(·) are taken, i.e., E = {(s, t) | s, t ∈ C. ∃a ∈ M₁(s). t ∈ Dest(s, a)}.

Given a set F ⊆ 2^S of subsets of states, we denote by InfSt(F) the event {ω | Inf(ω) ∈ F}. The following lemma states that in a player-1 MDP, for all strategies of player 1, the set of states visited infinitely often is an end component with probability 1. Lemma 2 follows easily from Lemma 1.

Lemma 1 ([6, 5]) Let C be the set of end components of a player-1 MDP G. For all strategies σ₁ ∈ Σ₁ and all states s ∈ S, we have Pr_s^{σ₁}(InfSt(C)) = 1.

Lemma 2 Let C be the set of end components and Z be the set of maximal end components of a player-1 MDP G. Then the following assertions hold:

• L = ∪_{C∈C} C = ∪_{Z∈Z} Z; and

• for all strategies σ₁ ∈ Σ₁ and all states s ∈ S, we have Pr_s^{σ₁}([[Reach(L)]]) = 1.

Lemma 3 Given a player-1 MDP G and an end component C, there is a uniform memoryless strategy σ₁ ∈ Σ₁^UM such that for all states s ∈ C we have Pr_s^{σ₁}({ω | Inf(ω) = C}) = 1.

Proof. For a state s ∈ C, let M₁(s) ⊆ Γ₁(s) be the subset of moves such that the conditions of the equivalent characterization of end components hold. Consider the uniform memoryless strategy σ₁ defined as follows: for all states s ∈ C,

  σ₁(s)(a) = 1/|M₁(s)| if a ∈ M₁(s), and σ₁(s)(a) = 0 otherwise.

Given the strategy σ₁, in the Markov chain G_{σ₁} the set C is a closed recurrent set of states. Hence the result follows.
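The maximal end components of an MDP can be computed by the standard iterative refinement suggested by the characterization above: restrict each candidate set to moves whose destinations stay inside it, split the result into strongly connected components, and repeat. A sketch of this classical algorithm (our implementation, not taken from the report; the quadratic SCC routine is chosen for brevity, not speed):

```python
def sccs(nodes, succ):
    """SCCs via mutual reachability (O(n^2), fine for a sketch)."""
    def reach(u):
        seen, stack = {u}, [u]
        while stack:
            v = stack.pop()
            for w in succ.get(v, ()):
                if w in nodes and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen
    fwd = {u: reach(u) for u in nodes}
    comps, assigned = [], set()
    for u in nodes:
        if u not in assigned:
            comp = {v for v in fwd[u] if u in fwd[v]}
            comps.append(comp)
            assigned |= comp
    return comps

def maximal_end_components(states, moves, dest):
    # moves[s]: available moves at s; dest[(s, a)]: set of possible successors
    work, mecs = [set(states)], []
    while work:
        c = work.pop()
        while True:  # prune states with no move that stays inside c
            allowed = {s: [a for a in moves[s] if dest[(s, a)] <= c] for s in c}
            bad = {s for s in c if not allowed[s]}
            if not bad:
                break
            c = c - bad
        if not c:
            continue
        succ = {s: set().union(*(dest[(s, a)] for a in allowed[s])) for s in c}
        comps = sccs(c, succ)
        if len(comps) == 1 and comps[0] == c:
            mecs.append(c)  # closed and strongly connected: an end component
        else:
            work.extend(comps)
    return mecs

# Hypothetical player-1 MDP: from 0, move x goes to {0, 1} and move y to {2};
# 1 returns to 0; 2 is absorbing. MECs: {0, 1} and {2}.
moves = {0: ['x', 'y'], 1: ['x'], 2: ['x']}
dest = {(0, 'x'): {0, 1}, (0, 'y'): {2}, (1, 'x'): {0}, (2, 'x'): {2}}
mecs = maximal_end_components([0, 1, 2], moves, dest)
```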

3.2 Nash equilibrium for reachability objectives

Memoryless Nash equilibria in discounted games. We first prove the existence of Nash equilibria in memoryless strategies in n-player discounted games with reachability objective [[Reach(R_i)]] for player i, for R_i ⊆ S. We then characterize ε-Nash equilibria in memoryless strategies with full support in n-player discounted games with reachability objectives.

Definition 5 (β-discounted games) Given an n-player game structure G, we write G^β to denote the β-discounted version of the game structure G. At each step the game G^β halts with probability β (it goes to a special absorbing state halt such that halt ∉ R_i for all i), and continues as the game G with probability 1 − β. We refer to β as the discount factor. In this paper we write G^β to denote a β-discounted game.

Definition 6 (Stopping time of a history in β-discounted games) Consider the stopping time τ defined on histories h = ⟨s₀, s₁, …⟩ by τ(h) = inf{k ≥ 0 | s_k = halt}, where the infimum of the empty set is +∞.

Lemma 4 Let G^β be an n-player β-discounted game structure, with β > 0. Then, for all states s ∈ S and all strategy profiles σ, we have

  Pr_s^σ[τ > m] ≤ (1 − β)^m.

Proof. At each step of the game G^β the game reaches the halt state with probability β. Hence the probability of not reaching the halt state in m steps is at most (1 − β)^m.

The proof of the following lemma is similar to the proof of Lemma 2.2 of [19].

Lemma 5 For every n-player β-discounted game structure G^β, with β > 0, with reachability objective [[Reach(R_i)]] for player i, there exist memoryless strategies σ_i, for i ∈ [n], such that the memoryless strategy profile σ = (σ₁, σ₂, …, σₙ) is a Nash equilibrium in G^β for every s ∈ S.

Proof. Regard each n-tuple σ = (σ₁, σ₂, …, σₙ) of memoryless strategies as a vector in a compact, convex subset K of the appropriate Euclidean space. Then define a correspondence that maps each element σ of K to the set Θ(σ) of all elements g = (g₁, g₂, …, gₙ) of K such that, for all i ∈ [n] and all s ∈ S, g_i is an optimal response for player i in G^β against σ_{−i}. Clearly, it suffices to show that there is a σ ∈ K such that σ ∈ Θ(σ). To show this, we verify the conditions of Kakutani's fixed point theorem [12]:

1. for every σ ∈ K, the set Θ(σ) is closed, convex and nonempty;

2. if g^(k) ∈ Θ(σ^(k)) for all k, with lim_{k→∞} g^(k) = g and lim_{k→∞} σ^(k) = σ, then g ∈ Θ(σ).

To verify condition 1, fix σ = (σ₁, σ₂, …, σₙ) ∈ K and i ∈ {1, 2, …, n}. For each s ∈ S, let v_i^β(s) be the maximal payoff that player i can achieve in G^β against σ_{−i}, i.e., v_i^β(s) = val_i^{G^β_{σ_{−i}}}([[Reach(R_i)]])(s). Since, after fixing the strategies of all the other players, the game structure becomes an MDP, we know that g_i is an optimal response to σ_{−i} if and only if, for each s ∈ S, g_i(s) puts positive probability only on actions a_i that maximize the expectation of v_i^β, namely

  Σ_{s′} v_i^β(s′) · δ_{σ_{−i}}(s, a_i)(s′).

The fact that any convex combination of optimal responses is again an optimal response in MDPs with reachability objectives follows from the fact that MDPs with reachability objectives can be solved by a linear program, and convex combinations of optimal responses satisfy the constraints of the linear program with optimal values. Hence condition 1 follows. Condition 2 is an easy consequence of the continuity of the mapping σ ↦ Pr_s^σ([[Reach(R_i)]]) from K to the real line; it follows from Lemma 4 that the mapping is continuous. The desired result follows.

Definition 7 (Difference of two MDPs) Let G₁ and G₂ be two player-i MDPs defined on the same state space S with the same set A of moves.
The difference of the two MDPs, denoted diff(G_1, G_2), is defined as:

diff(G_1, G_2) = Σ_{s ∈ S} Σ_{a ∈ A} Σ_{s' ∈ S} | δ_1(s, a)(s') − δ_2(s, a)(s') |.

That is, diff(G_1, G_2) is the sum of the differences of the probabilities of all the edges of the MDPs.
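Definition 7 is straightforward to compute. A minimal sketch (illustrative names, not from the paper), with each transition function represented as a nested dictionary δ[s][a][s']:

```python
def mdp_diff(delta1, delta2, states, moves):
    # diff(G1, G2): sum over all edges (s, a, s') of the absolute
    # difference of the two transition probabilities.
    return sum(
        abs(delta1[s][a].get(t, 0.0) - delta2[s][a].get(t, 0.0))
        for s in states
        for a in moves
        for t in states
    )

states, moves = ["s0", "s1"], ["a"]
d1 = {"s0": {"a": {"s0": 0.5, "s1": 0.5}}, "s1": {"a": {"s1": 1.0}}}
d2 = {"s0": {"a": {"s0": 0.6, "s1": 0.4}}, "s1": {"a": {"s1": 1.0}}}
d = mdp_diff(d1, d2, states, moves)  # |0.5 - 0.6| + |0.5 - 0.4| = 0.2
```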

Observe that in the context of a player-i MDP G, for all objectives Ψ_i, we have val_i^G(Ψ_i)(s) = sup_{σ_i ∈ Σ_i} Pr_s^{σ_i}(Ψ_i). The following lemma follows from a theorem of Filar-Vrieze [9] (page 185).

Lemma 6 (Lipschitz continuity for reachability objectives). Let G_1^β and G_2^β be two β-discounted player-i MDPs, with β > 0, on the same state space and with the same set of moves. For j ∈ {1, 2}, let v_j^β(s) = val_i^{G_j^β}(⟦Reach(R_i)⟧)(s), for R_i ⊆ S, i.e., v_j^β(s) denotes the value for player i for the reachability objective ⟦Reach(R_i)⟧ in the β-discounted MDP G_j^β. Then the following assertion holds:

|v_1^β(s) − v_2^β(s)| ≤ diff(G_1^β, G_2^β).

Lemma 7 (Nash equilibrium with full support). Let G_β be an n-player β-discounted game structure, with β > 0, and reachability objective ⟦Reach(R_i)⟧ for player i. For every ε > 0, there exist memoryless strategies σ_i for i ∈ [n] such that for all i ∈ [n] and all s ∈ S, Supp(σ_i(s)) = Γ_i(s) (i.e., every move of player i is played with positive probability), and the memoryless strategy profile σ = (σ_1, σ_2, …, σ_n) is an ε-Nash equilibrium in G_β for every s ∈ S.

Proof. Fix a Nash equilibrium σ' = (σ'_1, σ'_2, …, σ'_n) as obtained from Lemma 5. For all i ∈ [n], define

σ_i(s)(a) = w / |Γ_i(s)| + (1 − w) · σ'_i(s)(a), where w = ε / (n · |A| · |S|²),

for all s ∈ S and all a ∈ Γ_i(s). Let σ = (σ_1, σ_2, …, σ_n). Note that for all s ∈ S we have Supp(σ_i(s)) = Γ_i(s). For all i ∈ [n], consider the two player-i MDPs G_{β,σ_{−i}} and G_{β,σ'_{−i}}: for all s, s' ∈ S and a_i ∈ Γ_i(s), we have from the construction that

|δ_{σ_{−i}}(s, a_i)(s') − δ_{σ'_{−i}}(s, a_i)(s')| ≤ ε / (|S|² · |Γ_i(s)|).

It follows that diff(G_{β,σ_{−i}}, G_{β,σ'_{−i}}) ≤ ε. Since σ' is a Nash equilibrium, and for all i ∈ [n] the structures G_{β,σ_{−i}} and G_{β,σ'_{−i}} are β-discounted player-i MDPs with diff bounded by ε, the result follows from Lemma 6.
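The perturbation in the proof of Lemma 7 mixes the equilibrium strategy with a small uniform component. The sketch below (illustrative; the exact constants in the paper's definition may differ) shows the shape of the construction: every available move receives positive probability, the result is still a probability distribution, and each move's probability shifts by at most the mixing weight w:

```python
def full_support_perturbation(sigma_prime_s, gamma_s, w):
    # Mix sigma_prime(s) with the uniform distribution over the
    # available moves gamma_s, using total mixing weight w.
    k = len(gamma_s)
    return {a: w / k + (1.0 - w) * sigma_prime_s.get(a, 0.0) for a in gamma_s}

# A pure (hence not full-support) equilibrium strategy at some state s:
sigma_prime_s = {"a": 1.0, "b": 0.0}
sigma_s = full_support_perturbation(sigma_prime_s, ["a", "b"], w=0.01)
# Every move now has positive probability, and each entry moved by at most w.
```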
Lemma 8 (ε-Nash equilibrium for reachability games [4]). For every n-player game structure G with reachability objective ⟦Reach(R_i)⟧ for player i, and for every ε > 0, there exists β > 0 such that a memoryless ε-Nash equilibrium in G_β is a 2ε-Nash equilibrium in G.

Lemma 8 follows from the results in [4]. Lemma 7 and Lemma 8 yield Theorem 1.

Theorem 1 (ε-Nash equilibrium of full support). For every n-player game structure G with reachability objective ⟦Reach(R_i)⟧ for player i, and for every ε > 0, there exists a memoryless ε-Nash equilibrium σ* = (σ*_1, σ*_2, …, σ*_n) such that for all s ∈ S and all i ∈ [n], we have Supp(σ*_i(s)) = Γ_i(s).

4 Nash Equilibrium for Upward-closed Objectives

In this section we prove the existence of memoryless ε-Nash equilibria, for all ε > 0, for all n-player concurrent game structures with upward-closed objectives for all players. The key arguments use induction on the number of players, the results of Section 3, and an analysis of Markov chains and MDPs. We first present some definitions required for the analysis in the rest of the section.

MDP and graph of a game structure. Given an n-player concurrent game structure G, we define an associated MDP G̅ of G and an associated graph of G. The MDP G̅ = (S̅, A̅, Γ̅, δ̅) is defined as follows:

• S̅ = S; A̅ = A × A × … × A = A^n; and Γ̅(s) = { (a_1, a_2, …, a_n) | a_i ∈ Γ_i(s) }.
• δ̅(s, (a_1, a_2, …, a_n)) = δ(s, a_1, a_2, …, a_n).

The graph of the game structure G is defined as the graph of the MDP G̅.

Games with absorbing states. Given a game structure G, we partition the state space of G as follows:

1. The set of absorbing states in S is denoted T, i.e., T = { s ∈ S | s is an absorbing state }.
2. A set U of states consisting of the states s such that |Γ_i(s)| = 1 for all i ∈ [n], and such that all edges from U lead to U ∪ T. In other words, at states in U there is no non-trivial choice of moves for the players, and thus from any state s in U the game proceeds to the set T according to the probability distribution of the transition function δ at s.
3. C = S \ (U ∪ T).
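The associated MDP G̅ replaces the n simultaneous choices by a single choice of a joint move profile. A small illustrative sketch of Γ̅(s) as the product of the players' available move sets:

```python
from itertools import product

def joint_moves(move_sets):
    # Gamma_bar(s): all joint profiles (a_1, ..., a_n), one move per
    # player, given the list [Gamma_1(s), ..., Gamma_n(s)].
    return list(product(*move_sets))

profiles = joint_moves([["a", "b"], ["x", "y"]])
# Two players with two moves each induce 2 * 2 = 4 joint profiles in the MDP.
```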

Reachable sets. Given a game structure G and a state s ∈ S, we define Reachable(s, G) = { t ∈ S | there is a path from s to t in the graph of G } as the set of states that are reachable from s in the graph of the game structure. For a set Z ⊆ S, we denote by Reachable(Z, G) the set of states reachable from some state in Z, i.e., Reachable(Z, G) = ∪_{s ∈ Z} Reachable(s, G). Given a set Z, let Z_R = Reachable(Z, G). We denote by G ↾ Z_R the subgame induced by the set Z_R of states. Similarly, given a set F ⊆ 2^S, we denote by F ↾ Z_R the set { U | ∃F ∈ F. U = F ∩ Z_R }.

Terminal non-absorbing maximal end components (Tnec). Given a game structure G, let Z be the set of maximal end components of G̅. Let L = Z \ T be the set of maximal non-absorbing end components, and let H = ∪_{L ∈ L} L. A maximal end component Z ⊆ C is a terminal non-absorbing maximal end component (Tnec) if Reachable(Z, G) ∩ (H \ Z) = ∅, i.e., no other non-absorbing maximal end component is reachable from Z.

We consider in this section game structures G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i. We also denote by R_i = { s ∈ T | {s} ∈ UC_i } the set of absorbing states in T whose singletons belong to UC_i. We now prove the following key result.

Theorem 2. For all n-player concurrent game structures G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i, one of the following conditions (condition C1 or C2) holds:

1. (Condition C1) There exists a memoryless strategy profile σ ∈ Σ^M such that in the Markov chain G_σ there is a closed recurrent set Z ⊆ C such that σ is a Nash equilibrium for all states s ∈ Z.
2. (Condition C2) There exists a state s ∈ C such that for all ε > 0 there exists a memoryless ε-Nash equilibrium σ ∈ Σ^M for state s such that Pr_s^σ(⟦Reach(T)⟧) = 1 and, for all s ∈ S and all i ∈ [n], we have Supp(σ_i(s)) = Γ_i(s).

The proof of Theorem 2 is by induction on the number of players. We first analyze the base case.

Base Case.
(One-player game structures, i.e., MDPs.) We consider player-1 MDPs and analyze the following cases:

• (Case 1.) If there is no Tnec in C, then it follows from Lemma 2 that for all states s ∈ C and all strategies σ_1 ∈ Σ_1, we have Pr_s^{σ_1}(⟦Reach(T)⟧) = 1 and Pr_s^{σ_1}(⟦Reach(R_1)⟧) = Pr_s^{σ_1}(⟦UpClo(UC_1)⟧) (recall that R_1 = { s ∈ T | {s} ∈ UC_1 }). The result

of Theorem 1 yields an ε-Nash equilibrium σ_1 that satisfies condition C2 of Theorem 2 for all states s ∈ C.

• (Case 2.) Otherwise, let Z ⊆ C be a Tnec.

1. If Z ∈ UC_1, fix a uniform memoryless strategy σ_1 ∈ Σ_1^UM such that for all s ∈ Z, Pr_s^{σ_1}({ ω | Inf(ω) = Z }) = 1 and Pr_s^{σ_1}(⟦UpClo(UC_1)⟧) = 1 (such a strategy exists by Lemma 3, since Z is an end component). In other words, Z is a closed recurrent set in the Markov chain G_{σ_1} and the objective of player 1 is satisfied with probability 1. Hence condition C1 of Theorem 2 is satisfied.

2. If Z ∉ UC_1, then since UC_1 is upward-closed, for all sets Z_1 ⊆ Z we have Z_1 ∉ UC_1. Hence for any play ω with ω ∈ ⟦Safe(Z)⟧ we have Inf(ω) ⊆ Z, and therefore ω ∉ ⟦UpClo(UC_1)⟧. Hence for all states s ∈ Z,

sup_{σ_1 ∈ Σ_1} Pr_s^{σ_1}(⟦UpClo(UC_1)⟧) = sup_{σ_1 ∈ Σ_1} Pr_s^{σ_1}(⟦Reach(R_1)⟧).

If the set of edges from Z to U ∪ T is empty, then for all strategies σ_1 we have Pr_s^{σ_1}(⟦UpClo(UC_1)⟧) = 0, and hence any uniform memoryless strategy can be fixed and condition C1 of Theorem 2 is satisfied. Otherwise the set of edges from Z to U ∪ T is non-empty; then, for ε > 0, consider an ε-Nash equilibrium σ_1 for the reachability objective ⟦Reach(R_1)⟧ satisfying the conditions of Theorem 1. Since Z is an end component, for all states s ∈ Z we have Supp(σ_1(s)) = Γ_1(s), and since the set of edges from Z to U ∪ T is non-empty, it follows that for all states s ∈ Z we have Pr_s^{σ_1}(⟦Reach(T)⟧) = 1. Thus condition C2 of Theorem 2 is satisfied.

We prove the following lemma, which will be useful for the analysis of the inductive case.

Lemma 9. Consider a player-i MDP G with an upward-closed objective ⟦UpClo(UC_i)⟧ for player i. Let σ_i ∈ Σ_i^M be a memoryless strategy such that for all s ∈ S we have Supp(σ_i(s)) = Γ_i(s). Let Z ⊆ S be a closed recurrent set in the Markov chain G_{σ_i}. Then σ_i is a Nash equilibrium for all states s ∈ Z.

Proof. The proof follows from the analysis of two cases.

1. If Z ∈ UC_i, then since Z is a closed recurrent set in G_{σ_i}, for all states s ∈ Z we have Pr_s^{σ_i}({ ω | Inf(ω) = Z }) = 1. Hence we have Pr_s^{σ_i}(⟦UpClo(UC_i)⟧) = 1 ≥ sup_{σ'_i ∈ Σ_i} Pr_s^{σ'_i}(⟦UpClo(UC_i)⟧). The result follows.

2. We now consider the case Z ∉ UC_i. Since for all s ∈ Z we have Supp(σ_i(s)) = Γ_i(s), it follows that for all strategies σ'_i ∈ Σ_i and all s ∈ Z we have Outcome(s, σ'_i) ⊆ Outcome(s, σ_i) ⊆ ⟦Safe(Z)⟧ (since Z is a closed recurrent set in G_{σ_i}). It follows that for all strategies σ'_i we have Pr_s^{σ'_i}(⟦Safe(Z)⟧) = 1. Hence for all strategies σ'_i and all states s ∈ Z we have Pr_s^{σ'_i}({ ω | Inf(ω) ⊆ Z }) = 1. Since Z ∉ UC_i, and UC_i is upward-closed, it follows that for all strategies σ'_i and all states s ∈ Z we have Pr_s^{σ'_i}(⟦UpClo(UC_i)⟧) = 0. Hence for all states s ∈ Z we have sup_{σ'_i ∈ Σ_i} Pr_s^{σ'_i}(⟦UpClo(UC_i)⟧) = 0 = Pr_s^{σ_i}(⟦UpClo(UC_i)⟧). Thus the result follows.

Inductive case. Given a game structure G, consider the MDP G̅: if there is no Tnec in C, then the result follows from an analysis similar to Case 1 of the base case. Otherwise consider a Tnec Z ⊆ C in G̅. If for every player i we have Z ∈ UC_i, then fix a uniform memoryless strategy σ ∈ Σ^UM such that for all s ∈ Z, Pr_s^σ({ ω | Inf(ω) = Z }) = 1 (such a strategy exists by Lemma 3, since Z is an end component in G̅). Hence for all i ∈ [n] we have Pr_s^σ(⟦UpClo(UC_i)⟧) = 1. That is, Z is a closed recurrent set in the Markov chain G_σ and the objective of each player is satisfied with probability 1 from all states s ∈ Z. Hence condition C1 of Theorem 2 is satisfied. Otherwise there exists i ∈ [n] such that Z ∉ UC_i; without loss of generality we assume that this holds for player 1, i.e., Z ∉ UC_1. In this case we prove Lemma 10, which yields Theorem 2.

Lemma 10. Consider an n-player concurrent game structure G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i.
Let Z be a Tnec in G such that Z ∉ UC_1 and let Z_R = Reachable(Z, G). The following assertions hold:

1. If there exists σ_1 ∈ Σ_1^M such that for all s ∈ Z, Supp(σ_1(s)) = Γ_1(s), and condition C1 of Theorem 2 holds in G_{σ_1} ↾ Z_R, then condition C1 of Theorem 2 holds in G.
2. Otherwise, condition C2 of Theorem 2 holds in G.

Proof. Given a memoryless strategy σ_1, fixing the strategy σ_1 for player 1 yields an (n − 1)-player game structure, and by the inductive hypothesis either condition C1 or C2 of Theorem 2 holds in it.

• Case 1. If there is a memoryless strategy σ_1 ∈ Σ_1^M such that for all s ∈ Z, Supp(σ_1(s)) = Γ_1(s), and condition C1 of Theorem 2 holds in G_{σ_1}, then let σ_{−1} = (σ_2, σ_3, …, σ_n) be the memoryless Nash equilibrium and Z_1 ⊆ Z be the closed recurrent set in G_{σ_1 ∪ σ_{−1}} satisfying condition C1 of Theorem 2 in G_{σ_1}. Observe that (σ_1, Z_1) satisfies the conditions of Lemma 9 in the MDP G_{σ_{−1}}. Hence an application of Lemma 9 yields that σ_1 is a Nash equilibrium for all states s ∈ Z_1 in the MDP G_{σ_{−1}}. Since σ_{−1} is a Nash equilibrium for all states in Z_1 in G_{σ_1}, it follows that σ = σ_1 ∪ σ_{−1} and Z_1 satisfy condition C1 of Theorem 2.

• Case 2. For ε > 0, consider a memoryless ε-Nash equilibrium σ = (σ_1, σ_2, …, σ_n) in G with objective ⟦Reach(R_i)⟧ for player i, such that for all s ∈ S and all i ∈ [n] we have Supp(σ_i(s)) = Γ_i(s) (such an ε-Nash equilibrium exists by Theorem 1). We now prove the desired result by analyzing two sub-cases:

1. Suppose there exist j ∈ [n] and Z_j ⊆ Z such that Z_j ∈ UC_j and Z_j is an end component in G_{σ_{−j}}; then let σ'_j be a memoryless strategy for player j such that Z_j is a closed recurrent set of states in the Markov chain G_{σ_{−j} ∪ σ'_j}. Let σ' = σ_{−j} ∪ σ'_j. Since Z_j ∈ UC_j, it follows that for all states s ∈ Z_j we have Pr_s^{σ'}(⟦UpClo(UC_j)⟧) = 1, and hence player j has no incentive to deviate from σ'. Since for all σ_i with i ≠ j and all states s ∈ S we have Supp(σ_i(s)) = Γ_i(s), and Z_j is a closed recurrent set in G_{σ'}, it follows from Lemma 9 that for all i ≠ j the strategy σ_i is a Nash equilibrium in G_{σ'_{−i}}. Hence σ' is a Nash equilibrium for all states s ∈ Z_j in G, and condition C1 of Theorem 2 is satisfied.

2.
Hence it follows that if sub-case 1 fails, then for all i ∈ [n] and all end components Z_i ⊆ Z in G_{σ_{−i}}, we have Z_i ∉ UC_i. Hence for all i ∈ [n], all s ∈ Z, and all σ'_i ∈ Σ_i, Pr_s^{σ_{−i} ∪ σ'_i}(⟦UpClo(UC_i)⟧) = Pr_s^{σ_{−i} ∪ σ'_i}(⟦Reach(R_i)⟧). Since σ is an ε-Nash equilibrium with objectives ⟦Reach(R_i)⟧ for player i in G, it follows that σ is an ε-Nash equilibrium in G with objectives ⟦UpClo(UC_i)⟧ for player i. Moreover, if there were a closed recurrent set Z' ⊆ Z in the Markov chain G_σ, then sub-case 1 would have been true (follows

from Lemma 9). Hence if sub-case 1 fails, then there is no closed recurrent set Z' ⊆ Z in G_σ, and hence for all states s ∈ Z we have Pr_s^σ(⟦Reach(T)⟧) = 1. Hence condition C2 of Theorem 2 holds, and the desired result is established.

Inductive application of Theorem 2. Given a game structure G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i, to prove the existence of an ε-Nash equilibrium for all states s ∈ S, for ε > 0, we apply Theorem 2 recursively. We convert the game structure G to a game structure G' as follows:

1. Transformation 1. If condition C1 of Theorem 2 holds, then let Z be the closed recurrent set that satisfies condition C1 of Theorem 2.
• In G' convert every state s ∈ Z to an absorbing state;
• if Z ∉ UC_i for player i, then the objective for player i in G' is UC_i;
• if Z ∈ UC_i for player i, then the objective for player i in G' is modified to include every set containing a state of Z, i.e., for all Q ⊆ S, if s ∈ Q for some s ∈ Z, then Q is included in UC_i.

Observe that the states in Z are converted to absorbing states and will be interpreted as states in T in G'.

2. Transformation 2. If condition C2 of Theorem 2 holds, then let σ* be an ε/|S|-Nash equilibrium from state s such that Pr_s^{σ*}(⟦Reach(T)⟧) = 1. The state s is converted as follows: for all i ∈ [n] the set of moves available to player i at s is reduced to a single move, i.e., Γ_i(s) = { a_i } for all i ∈ [n], and the transition function δ' in G' at s is defined as:

δ'(s, a_1, a_2, …, a_n)(t) = Pr_s^{σ*}(⟦Reach(t)⟧) if t ∈ T, and 0 otherwise.

Note that the state s can be interpreted as a state in U in G'. To obtain an ε-Nash equilibrium for all states s ∈ S in G, it suffices to obtain an ε-Nash equilibrium for all states in G'. Also observe that for all states in U ∪ T a Nash equilibrium exists by definition. Applying the transformations recursively on G', we proceed to convert every state to a state in U ∪ T, and the desired result follows. This yields Theorem 3.
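Transformation 2 can be sketched as follows (illustrative names, not from the paper): the state s keeps a single joint move whose outcome distribution is the reach-probability profile of σ* over the terminal states T:

```python
def transformation2_distribution(reach_prob, terminals):
    # delta'(s, a_1, ..., a_n): send s to each t in T with probability
    # Pr_s^{sigma*}(Reach(t)), and to every other state with probability 0.
    dist = {t: reach_prob.get(t, 0.0) for t in terminals}
    # Since Pr_s^{sigma*}(Reach(T)) = 1, the entries must sum to 1.
    assert abs(sum(dist.values()) - 1.0) < 1e-9
    return dist

dist = transformation2_distribution({"t1": 0.3, "t2": 0.7}, ["t1", "t2"])
```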

Theorem 3. For all n-player concurrent game structures G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i, for all ε > 0 and all states s ∈ S, there exists a memoryless strategy profile σ* such that σ* is an ε-Nash equilibrium for state s.

Remark 1. It may be noted that upward-closed objectives are not closed under complementation, and hence Theorem 3 is not a generalization of the determinacy result for concurrent zero-sum games with an upward-closed objective for one player. For example, in concurrent zero-sum games with a Büchi objective for a player, ε-optimal strategies require infinite memory in general, but the complement of a Büchi objective is not upward-closed (recall Example 1). In contrast, we show the existence of memoryless ε-Nash equilibria for n-player concurrent games where each player has an upward-closed objective. For the special case of zero-sum turn-based games with an upward-closed objective for a player, the existence of memoryless optimal strategies was proved in [3]; note, however, that the memoryless strategies require randomization, as pure or deterministic strategies require memory even for turn-based games with generalized Büchi objectives.

5 Computational Complexity

In this section we present an algorithm to compute an ε-Nash equilibrium for n-player game structures with upward-closed objectives, for ε > 0. A key result for the algorithmic analysis is Lemma 11.

Lemma 11. Consider an n-player concurrent game structure G with upward-closed objective ⟦UpClo(UC_i)⟧ for player i. Let Z be a Tnec in G such that Z ∉ UC_n and let Z_R = Reachable(Z, G). The following assertion holds:

1. Suppose there exists σ_n ∈ Σ_n^M such that for all s ∈ Z, Supp(σ_n(s)) = Γ_n(s), and condition C1 of Theorem 2 holds in G_{σ_n} ↾ Z_R. Let σ*_n ∈ Σ_n^UM be such that for all s ∈ Z we have Supp(σ*_n(s)) = Γ_n(s) (i.e., σ*_n is a uniform memoryless strategy that plays all available moves at all states in Z).
Then condition C1 holds in G_{σ*_n} ↾ Z_R.

Proof. The result follows from the observation that for any strategy profile σ_{−n} ∈ Σ_{−n}^M, the closed recurrent sets of states in G_{σ_{−n} ∪ σ_n} and G_{σ_{−n} ∪ σ*_n} are the same.

Lemma 11 presents the basic principle to identify whether condition C1 of Theorem 2 holds in a game structure G with upward-closed objective

Algorithm 1 UpCloCondC1
Input: An n-player game structure G, with upward-closed objective ⟦UpClo(UC_i)⟧ for player i, for all i ∈ [n].
Output: Either (Z, σ) satisfying condition C1 of Theorem 2, or else (∅, ∅).
1. if n = 0,
   1.1 if there is a non-absorbing closed recurrent set Z in the Markov chain G, return (Z, ∅).
   1.2 else return (∅, ∅).
2. Z := ComputeMaximalEC(G̅) (i.e., Z is the set of maximal end components in the MDP G̅ of G).
3. if there is no Tnec in G, return (∅, ∅).
4. if there exists Z ∈ Z such that for all i ∈ [n], Z ∈ UC_i,
   4.1 return (Z, σ) such that σ ∈ Σ^UM and Z is a closed recurrent set in G_σ.
5. Let Z be a Tnec in G, and let Z_R = Reachable(Z, G).
6. else, without loss of generality, let Z ∉ UC_n.
   6.1 Let σ_n ∈ Σ_n^UM be such that for all states s ∈ Z_R, Supp(σ_n(s)) = Γ_n(s).
   6.2 (Z_1, σ) := UpCloCondC1(G_{σ_n} ↾ Z_R, n − 1, ⟦UpClo(UC_i ↾ Z_R)⟧ for i ∈ [n − 1]).
   6.3 if Z_1 = ∅, return (∅, ∅);
   6.4 else return (Z_1, σ_n ∪ σ).

⟦UpClo(UC_i)⟧ for player i. An informal description of the algorithm (Algorithm 1) is as follows. The algorithm takes as input a game structure G of n players with objectives ⟦UpClo(UC_i)⟧ for player i, and either returns (Z, σ) ∈ 2^S × Σ^M satisfying condition C1 of Theorem 2 or returns (∅, ∅). Let G̅ be the MDP of G, and let Z be the set of maximal end components in G̅ (computed in Step 2 of Algorithm 1). If there is no Tnec in G, then condition C1 of Theorem 2 fails and (∅, ∅) is returned (Step 3 of Algorithm 1). If there is a maximal end component Z ∈ Z such that for all i ∈ [n], Z ∈ UC_i, then fix a uniform memoryless strategy σ ∈ Σ^UM such that Z is a closed recurrent set in G_σ and return (Z, σ) (Step 4 of Algorithm 1). Else let Z be a Tnec and without loss of generality let Z ∉ UC_n. Let Z_R = Reachable(Z, G), and fix a strategy σ_n ∈ Σ_n^UM such that for all s ∈ Z_R, Supp(σ_n(s)) = Γ_n(s).
The (n − 1)-player game structure G_{σ_n} ↾ Z_R is solved by a recursive call (Step 6.2) and the result of the recursive call is returned. It follows from Lemma 11 that if Algorithm 1
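The observation behind Lemma 11 is that the closed recurrent sets of a finite Markov chain depend only on which edges have positive probability, i.e., only on the supports of the strategies. A sketch (illustrative, not from the paper) that computes them as the bottom strongly connected components of the support graph:

```python
from collections import deque

def reachable_from(s, edges):
    # States reachable from s (including s) in the support graph.
    seen, frontier = {s}, deque([s])
    while frontier:
        u = frontier.popleft()
        for v in edges.get(u, ()):
            if v not in seen:
                seen.add(v)
                frontier.append(v)
    return seen

def closed_recurrent_sets(edges):
    # Closed recurrent sets = bottom SCCs: mutually reachable classes
    # with no positive-probability edge leaving the class.
    nodes = set(edges) | {v for vs in edges.values() for v in vs}
    reach = {u: reachable_from(u, edges) for u in nodes}
    classes, seen = [], set()
    for u in nodes:
        if u in seen:
            continue
        scc = {v for v in reach[u] if u in reach[v]}
        seen |= scc
        classes.append(scc)
    return [c for c in classes
            if all(v in c for u in c for v in edges.get(u, ()))]

# Support graph of a small chain: {a, b} can escape to c; c is absorbing.
support = {"a": ["b"], "b": ["a", "c"], "c": ["c"]}
bottoms = closed_recurrent_sets(support)  # only {"c"} is closed and recurrent
```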

A Survey of Stochastic ω-regular Games

A Survey of Stochastic ω-regular Games A Survey of Stochastic ω-regular Games Krishnendu Chatterjee Thomas A. Henzinger EECS, University of California, Berkeley, USA Computer and Communication Sciences, EPFL, Switzerland {c krish,tah}@eecs.berkeley.edu

More information

A Survey of Partial-Observation Stochastic Parity Games

A Survey of Partial-Observation Stochastic Parity Games Noname manuscript No. (will be inserted by the editor) A Survey of Partial-Observation Stochastic Parity Games Krishnendu Chatterjee Laurent Doyen Thomas A. Henzinger the date of receipt and acceptance

More information

The Complexity of Stochastic Müller Games

The Complexity of Stochastic Müller Games The Complexity of Stochastic Müller Games Krishnendu Chatterjee Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2007-110 http://www.eecs.berkeley.edu/pubs/techrpts/2007/eecs-2007-110.html

More information

Finitary Winning in \omega-regular Games

Finitary Winning in \omega-regular Games Finitary Winning in \omega-regular Games Krishnendu Chatterjee Thomas A. Henzinger Florian Horn Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2007-120

More information

Report Documentation Page

Report Documentation Page On Nash Equilibria in Stochastic Games Krishnendu Chatterjee Marcin Jurdziński Rupak Majumdar Report No. UCB/CSD-3-28 October 2003 Computer Science Division (EECS) University of California Berkeley, California

More information

The Complexity of Ergodic Mean-payoff Games,

The Complexity of Ergodic Mean-payoff Games, The Complexity of Ergodic Mean-payoff Games, Krishnendu Chatterjee Rasmus Ibsen-Jensen Abstract We study two-player (zero-sum) concurrent mean-payoff games played on a finite-state graph. We focus on the

More information

Markov Decision Processes with Multiple Long-run Average Objectives

Markov Decision Processes with Multiple Long-run Average Objectives Markov Decision Processes with Multiple Long-run Average Objectives Krishnendu Chatterjee UC Berkeley c krish@eecs.berkeley.edu Abstract. We consider Markov decision processes (MDPs) with multiple long-run

More information

Finitary Winning in ω-regular Games

Finitary Winning in ω-regular Games Finitary Winning in ω-regular Games Krishnendu Chatterjee 1 and Thomas A. Henzinger 1,2 1 University of California, Berkeley, USA 2 EPFL, Switzerland {c krish,tah}@eecs.berkeley.edu Abstract. Games on

More information

Qualitative Concurrent Parity Games

Qualitative Concurrent Parity Games Qualitative Concurrent Parity Games Krishnendu Chatterjee Luca de Alfaro Thomas A Henzinger CE, University of California, Santa Cruz,USA EECS, University of California, Berkeley,USA Computer and Communication

More information

Randomness for Free. 1 Introduction. Krishnendu Chatterjee 1, Laurent Doyen 2, Hugo Gimbert 3, and Thomas A. Henzinger 1

Randomness for Free. 1 Introduction. Krishnendu Chatterjee 1, Laurent Doyen 2, Hugo Gimbert 3, and Thomas A. Henzinger 1 Randomness for Free Krishnendu Chatterjee 1, Laurent Doyen 2, Hugo Gimbert 3, and Thomas A. Henzinger 1 1 IST Austria (Institute of Science and Technology Austria) 2 LSV, ENS Cachan & CNRS, France 3 LaBri

More information

Infinite Games. Sumit Nain. 28 January Slides Credit: Barbara Jobstmann (CNRS/Verimag) Department of Computer Science Rice University

Infinite Games. Sumit Nain. 28 January Slides Credit: Barbara Jobstmann (CNRS/Verimag) Department of Computer Science Rice University Infinite Games Sumit Nain Department of Computer Science Rice University 28 January 2013 Slides Credit: Barbara Jobstmann (CNRS/Verimag) Motivation Abstract games are of fundamental importance in mathematics

More information

Value Iteration. 1 Introduction. Krishnendu Chatterjee 1 and Thomas A. Henzinger 1,2

Value Iteration. 1 Introduction. Krishnendu Chatterjee 1 and Thomas A. Henzinger 1,2 Value Iteration Krishnendu Chatterjee 1 and Thomas A. Henzinger 1,2 1 University of California, Berkeley 2 EPFL, Switzerland Abstract. We survey value iteration algorithms on graphs. Such algorithms can

More information

Generalized Parity Games

Generalized Parity Games Generalized Parity Games Krishnendu Chatterjee 1, Thomas A. Henzinger 1,2, and Nir Piterman 2 1 University of California, Berkeley, USA 2 EPFL, Switzerland c krish@eecs.berkeley.edu, {tah,nir.piterman}@epfl.ch

More information

Generalized Parity Games

Generalized Parity Games Generalized Parity Games Krishnendu Chatterjee Thomas A. Henzinger Nir Piterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2006-144

More information

Quantitative Languages

Quantitative Languages Quantitative Languages Krishnendu Chatterjee 1, Laurent Doyen 2, and Thomas A. Henzinger 2 1 University of California, Santa Cruz 2 EPFL, Lausanne, Switzerland Abstract. Quantitative generalizations of

More information

Lecture notes for Analysis of Algorithms : Markov decision processes

Lecture notes for Analysis of Algorithms : Markov decision processes Lecture notes for Analysis of Algorithms : Markov decision processes Lecturer: Thomas Dueholm Hansen June 6, 013 Abstract We give an introduction to infinite-horizon Markov decision processes (MDPs) with

More information

Complexity Bounds for Muller Games 1

Complexity Bounds for Muller Games 1 Complexity Bounds for Muller Games 1 Paul Hunter a, Anuj Dawar b a Oxford University Computing Laboratory, UK b University of Cambridge Computer Laboratory, UK Abstract We consider the complexity of infinite

More information

On Strong Determinacy of Countable Stochastic Games

On Strong Determinacy of Countable Stochastic Games On Strong Determinacy of Countable Stochastic Games Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Dominik Wojtczak University of Oxford, UK University of Edinburgh, UK University of Liverpool, UK arxiv:70405003v

More information

Theoretical Computer Science

Theoretical Computer Science Theoretical Computer Science 458 (2012) 49 60 Contents lists available at SciVerse ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs Energy parity games Krishnendu

More information

Stochastic Games with Time The value Min strategies Max strategies Determinacy Finite-state games Cont.-time Markov chains

Stochastic Games with Time The value Min strategies Max strategies Determinacy Finite-state games Cont.-time Markov chains Games with Time Finite-state Masaryk University Brno GASICS 00 /39 Outline Finite-state stochastic processes. Games over event-driven stochastic processes. Strategies,, determinacy. Existing results for

More information

Synthesis weakness of standard approach. Rational Synthesis

Synthesis weakness of standard approach. Rational Synthesis 1 Synthesis weakness of standard approach Rational Synthesis 3 Overview Introduction to formal verification Reactive systems Verification Synthesis Introduction to Formal Verification of Reactive Systems

More information

Perfect-information Stochastic Parity Games

Perfect-information Stochastic Parity Games Perfect-information Stochastic Parity Games Wies law Zielonka LIAFA, case 7014 Université Paris 7 2, Place Jussieu 75251 Paris Cedex 05, France zielonka@liafa.jussieu.fr Abstract. We show that in perfect-information

More information

Mean-Payoff Games and the Max-Atom Problem

Mean-Payoff Games and the Max-Atom Problem Mean-Payoff Games and the Max-Atom Problem Albert Atserias Universitat Politècnica de Catalunya Barcelona, Spain Elitza Maneva Universitat Politècnica de Catalunya Barcelona, Spain February 3, 200 Abstract

More information

Quantitative Solution of Omega-Regular Games

Quantitative Solution of Omega-Regular Games Quantitative Solution of Omega-Regular Games Luca de Alfaro Department of EECS UC Berkeley Berkeley, CA 9470, USA dealfaro@eecs.berkeley.edu Rupak Majumdar Department of EECS UC Berkeley Berkeley, CA 9470,

More information

Chapter 9. Mixed Extensions. 9.1 Mixed strategies

Chapter 9. Mixed Extensions. 9.1 Mixed strategies Chapter 9 Mixed Extensions We now study a special case of infinite strategic games that are obtained in a canonic way from the finite games, by allowing mixed strategies. Below [0, 1] stands for the real

More information

Maths 212: Homework Solutions

Maths 212: Homework Solutions Maths 212: Homework Solutions 1. The definition of A ensures that x π for all x A, so π is an upper bound of A. To show it is the least upper bound, suppose x < π and consider two cases. If x < 1, then

More information

Selecting Efficient Correlated Equilibria Through Distributed Learning. Jason R. Marden

Selecting Efficient Correlated Equilibria Through Distributed Learning. Jason R. Marden 1 Selecting Efficient Correlated Equilibria Through Distributed Learning Jason R. Marden Abstract A learning rule is completely uncoupled if each player s behavior is conditioned only on his own realized

More information

Basic Game Theory. Kate Larson. January 7, University of Waterloo. Kate Larson. What is Game Theory? Normal Form Games. Computing Equilibria

Basic Game Theory. Kate Larson. January 7, University of Waterloo. Kate Larson. What is Game Theory? Normal Form Games. Computing Equilibria Basic Game Theory University of Waterloo January 7, 2013 Outline 1 2 3 What is game theory? The study of games! Bluffing in poker What move to make in chess How to play Rock-Scissors-Paper Also study of

More information

MORE ON CONTINUOUS FUNCTIONS AND SETS

MORE ON CONTINUOUS FUNCTIONS AND SETS Chapter 6 MORE ON CONTINUOUS FUNCTIONS AND SETS This chapter can be considered enrichment material containing also several more advanced topics and may be skipped in its entirety. You can proceed directly

More information

Quitting games - An Example

Quitting games - An Example Quitting games - An Example E. Solan 1 and N. Vieille 2 January 22, 2001 Abstract Quitting games are n-player sequential games in which, at any stage, each player has the choice between continuing and

More information

Near-Potential Games: Geometry and Dynamics






