Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) Disarmament Games

Disarmament Games

Yuan Deng, Vincent Conitzer
Department of Computer Science, Duke University, Durham, NC 27708, USA

Abstract

Much recent work in the AI community concerns algorithms for computing optimal mixed strategies to commit to, as well as the deployment of such algorithms in real security applications. Another possibility is to commit not to play certain actions. If only one player makes such a commitment, then this is generally less powerful than completely committing to a single mixed strategy. However, if players can alternatingly commit not to play certain actions and thereby iteratively reduce their strategy spaces, then desirable outcomes can be obtained that would not have been possible with just a single player committing to a mixed strategy. We refer to such a setting as a disarmament game. In this paper, we study disarmament for two-player normal-form games. We show that deciding whether an outcome can be obtained with disarmament is NP-complete (even for a fixed number of rounds), if only pure strategies can be removed. On the other hand, for the case where mixed strategies can be removed, we provide a folk theorem showing that all desirable utility profiles can be obtained, and we give an efficient algorithm for (approximately) obtaining them.

Introduction

Disarmament is often a desired objective in international relations, but it is not always easy to reach the end goal. A key problem is that by removing military assets, a country may leave itself vulnerable to attack if the other country does not disarm. Therefore, disarmament typically happens in a sequence of carefully designed stages, so that neither country is ever too exposed at any stage. Besides reductions in military assets, we can also take disarmament as a metaphor for other strategic situations.
For example, two companies may each hold a portfolio of patents that could be used to inflict significant damage on the other company, and the companies may wish to make a sequence of legal agreements to reduce the risk on both sides. In these situations, once one of the players deviates from the disarmament protocol, the possible actions each side has remaining can strategically interact in complex ways to determine the final payoffs realized. In order to achieve a high level of generality, in this paper we consider disarmament in general two-player normal-form games, as illustrated by the following example.

Example 1 (Extended Prisoner's Dilemma). Consider the following modified version of the prisoner's dilemma.

                Cooperate    Defect      Painful
    Cooperate   3, 3         0, 4        0.1, 0
    Defect      4, 0         1, 1        0.5, 0.5
    Painful     0, 0.1       0.5, 0.5    0, 0

    Table 1: Payoff matrix of the Extended Prisoner's Dilemma

Strategy Defect strictly dominates strategies Cooperate and Painful for both players. Thus, the only Nash (or even correlated, or coarse correlated) equilibrium of this game is (Defect, Defect), with utilities (1, 1). If a single player can commit, this does not help, because the other player would still play Defect. On the other hand, suppose both players can alternatingly remove their strategies and act according to the following protocol. In the first round, Row removes Defect. Since Defect is still Column's dominant strategy, the only Nash equilibrium of the reduced game is (Painful, Defect), with utilities (0.5, 0.5). Next, Column removes his strategy Defect. In the remaining game, Cooperate has become a dominant strategy, resulting in utilities (3, 3). At each step in this protocol of removing strategies, deviating from the protocol and playing the game remaining at that point is dominated by (3, 3). Thus, both players are best off following the disarmament protocol.
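The claims in Example 1 can be verified mechanically. Below is a small sketch (my own code, not from the paper; it assumes the payoff matrix of Table 1 as reconstructed here, with the dictionary representation and `dominates` helper being mine) that checks the dominance relations at each stage of the protocol:

```python
# A sketch (my own check, not the authors' code) of Example 1's protocol,
# assuming Table 1's payoffs. Strategies: C(ooperate), D(efect), P(ainful).
U0 = {('C', 'C'): 3, ('C', 'D'): 0,   ('C', 'P'): 0.1,
      ('D', 'C'): 4, ('D', 'D'): 1,   ('D', 'P'): 0.5,
      ('P', 'C'): 0, ('P', 'D'): 0.5, ('P', 'P'): 0}
U1 = {(r, c): U0[(c, r)] for (r, c) in U0}  # the game is symmetric

def dominates(U, mover, a, b, rows, cols):
    """True iff the mover's strategy a strictly dominates b in the induced game."""
    if mover == 0:  # row player: compare against every remaining column
        return all(U[(a, c)] > U[(b, c)] for c in cols)
    return all(U[(r, a)] > U[(r, b)] for r in rows)

S = ['C', 'D', 'P']
# Defect strictly dominates both other strategies for Row in the full game.
assert dominates(U0, 0, 'D', 'C', S, S) and dominates(U0, 0, 'D', 'P', S, S)
# After Row removes Defect, Defect is still dominant for Column: outcome (P, D).
assert dominates(U1, 1, 'D', 'C', ['C', 'P'], S) and dominates(U1, 1, 'D', 'P', ['C', 'P'], S)
# After Column also removes Defect, Cooperate dominates Painful, giving (C, C).
assert dominates(U0, 0, 'C', 'P', ['C', 'P'], ['C', 'P'])
assert U0[('C', 'C')] == U1[('C', 'C')] == 3
```

Running the assertions confirms the chain of dominance arguments used in the example.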
In this paper, we first formalize the idea of a disarmament game, as played on top of a game represented in normal form (as illustrated in Example 1). We introduce the computational problem DISARM, which asks whether there is an equilibrium of the disarmament game leading to some desired specified outcome, and a variant K-DISARM in which there are only K rounds of disarmament. We show both problems to be NP-complete. We then introduce a mixed disarmament variant that allows the removal of mixed strategies, by upper-bounding the probabilities on individual pure strategies. Here our results are positive: we show that a type of folk theorem holds (without repetition of the game!), namely that for any feasible utilities that exceed the players' security levels, there is an equilibrium achieving at least those utilities. Our proof is constructive, and in fact shows that we can approximately obtain the desired result in approximate equilibrium using only a few rounds of disarmament.

Related Work

Game-theoretic commitment, especially to mixed strategies, has received significant recent attention in the multiagent systems literature, in large part due to its application in various security domains (Tambe 2011). In a two-player normal-form game, an optimal mixed strategy to commit to can be found in polynomial time (Conitzer and Sandholm 2006; von Stengel and Zamir 2010), though there are hardness results for Bayesian and extensive-form games (Conitzer and Sandholm 2006; Letchford, Conitzer, and Munagala 2009; Letchford and Conitzer 2010; Bošanský et al. 2015). There is also significant interest in other types of action one can take before the game in order to make the outcome more favorable. This includes promising payments if a certain outcome is reached (Monderer and Tennenholtz 2004; Anderson, Shoham, and Altman 2010; Deng, Tang, and Zheng 2016) or getting to choose from multiple possible utilities in the entries (Brill, Freeman, and Conitzer 2016). Also, mechanism design involves a special player, the designer, committing to the mechanism beforehand. But in all these cases, there is only one commitment step before the game is played, unlike in this paper.

Definitions

We define disarmament games based on the normal-form representation, as in the example provided in the introduction. We restrict attention to two-player games throughout. A normal-form game is defined by G = ⟨S_0, S_1, u_0, u_1⟩, where for each player b, his set of pure strategies is S_b and his utility function is u_b : S_0 × S_1 → R; u_b(s_0, s_1) denotes player b's utility when player 0 plays s_0 and player 1 plays s_1. Moreover, for T_0 ⊆ S_0 and T_1 ⊆ S_1, the game induced by T_0 and T_1 is the two-player normal-form game G_{T_0,T_1} = ⟨T_0, T_1, u_0, u_1⟩, where u_0 and u_1 are restricted to T_0 × T_1. As usual, we use −b to denote the player other than b. For convenience, each player's utility is normalized into the interval [0, 1].
We now define the disarmament game G_D(G) played on top of this normal-form game. This disarmament game consists of a disarmament stage, during which the players alternatingly remove nonempty sets of strategies from S_0 and S_1, and a game-play stage, triggered when a player removes nothing, during which they play whatever normal-form game remains. Note that many disarmament sequences can result in the same state [T_0, T_1, b] (where b is the player to move); rather than duplicate this state many times in the game tree, we represent the tree as a directed acyclic graph (DAG) in which each state occurs only once. Note that the game is one of perfect information, except that the players move simultaneously in the game-play stage. We now present the extensive form of the game precisely.^1

^1 Formally, what we present is not exactly the extensive form, because (1) we use a DAG rather than a tree and (2) the terminal nodes are associated with the game to be played in the game-play stage rather than directly with utilities, but it is straightforward to extract the formal extensive form from this.

Definition 1 (Disarmament Game). The disarmament game G_D(G) is defined as an extensive-form game as follows.

- The set of disarmament actions A = {X_0 | X_0 ⊆ S_0} ∪ {X_1 | X_1 ⊆ S_1} ∪ {Play}, where X_b denotes the set of strategies to keep and Play denotes ending the disarmament stage;
- The set of non-terminal nodes H = {[T_0, T_1, b] | T_0 ⊆ S_0, T_1 ⊆ S_1, b ∈ {0, 1}};
- The set of terminal nodes Z = {[T_0, T_1] | T_0 ⊆ S_0, T_1 ⊆ S_1};
- The player selection function ρ : H → {0, 1}, with ρ([T_0, T_1, b]) = b;
- The available-actions function χ : H → 2^A, with χ([T_0, T_1, b]) = {X_b | ∅ ⊂ X_b ⊂ T_b} ∪ {Play};
- The successor function γ : H × A → H ∪ Z, with
  γ([T_0, T_1, 0], X_0) = [X_0, T_1, 1];
  γ([T_0, T_1, 1], X_1) = [T_0, X_1, 0];
  γ([T_0, T_1, b], Play) = [T_0, T_1];
- The root of the game is root = [S_0, S_1, 0].

In the terminal nodes z = [T_0, T_1], players 0 and 1 play the normal-form game G_{T_0,T_1}, resulting in their final utilities.

Definition 2 (Strategy in G_D).
A strategy σ_b = (α_b, β_b) for G_D consists of a disarmament strategy α_b ∈ ∏_{h ∈ H : ρ(h)=b} χ(h) for the non-terminal nodes and a play strategy β_b ∈ ∏_{z=[T_0,T_1] ∈ Z} Δ(T_b) for the terminal nodes (where Δ(T_b) is the set of distributions over T_b). Note that we restrict our attention to deterministic behavior during the (perfect-information) disarmament stage.

Definition 3 (On-path history & outcome). For a strategy profile (σ_0, σ_1) with σ_0 = (α_0, β_0) and σ_1 = (α_1, β_1), denote the on-path history by P = (h_0 = root, h_1, …, h_K, h_{K+1} = z = [T_0, T_1]), where for all i, γ(h_i, α_{ρ(h_i)}(h_i)) = h_{i+1}. The outcome of the strategy profile (σ_0, σ_1) is (β_0(z), β_1(z)). We say an on-path history has length K if it contains K non-terminal nodes, excluding the root.

In a slight abuse of notation, let u_b(o) be player b's (expected) utility for outcome o. A strategy profile (σ_0, σ_1) forms a Nash equilibrium if and only if no player can increase his utility by deviating to another strategy. We consider the following computational problem:

Definition 4 (DISARM problem). In DISARM, given a disarmament game G_D(G) and an outcome o = (β*_0, β*_1), the objective is to determine whether there exists a Nash equilibrium (σ*_0, σ*_1) whose outcome is o.

We also consider a variation of the DISARM problem, called K-DISARM.

Definition 5 (K-DISARM problem). In K-DISARM, given a disarmament game G_D(G) and an outcome o = (β*_0, β*_1), the objective is to determine whether there exists a Nash equilibrium (σ*_0, σ*_1) whose outcome is o and whose induced on-path history has length at most K.

Computational Complexity

Since the number of game states is exponential, an efficient algorithm cannot output the entire strategy profile under the standard representation that lists each player's action at every game state. We next show that a restricted class of strategies that can be represented efficiently suffices. These strategies directly specify the on-path behavior and require that minimax strategies are used off-path.

On-path Histories are Sufficient

Recall that the on-path history of a profile of strategies in an extensive-form game consists of the actions that the players take when nobody deviates. For every non-terminal node in the on-path history, one can define the security level for each player.

Definition 6 (Security level). The security level sec_b for player b in a two-player normal-form game G_{T_0,T_1} = ⟨T_0, T_1, u_0, u_1⟩ is the utility that player b can guarantee himself no matter how the other player plays. Formally,

    sec_b(G_{T_0,T_1}) = max_{β_b ∈ Δ(T_b)} min_{β_{−b} ∈ Δ(T_{−b})} u_b(β_b, β_{−b}).

We are going to show that the security level is the essential quantity for determining whether an on-path history can be induced by a Nash equilibrium strategy profile. In order for a specific outcome to be reached in Nash equilibrium, every player must have a strong enough incentive to follow the on-path history, rather than deviating, at any point in the on-path history. To minimize the incentive to deviate, we may assume that if someone deviates, the other player will act to minimize the deviator's utility. This does not prevent the latter player's strategy from being a best response, because such punishment will not occur on the path of play. The most effective punishment will be to bring the deviator down to his security level, which is possible by the minimax theorem (von Neumann 1928).

Lemma 1. P is an on-path history induced by a Nash equilibrium strategy profile with outcome o if and only if for each non-terminal node [T_0, T_1, b] ∈ P, sec_b(G_{T_0,T_1}) ≤ u_b(o).

Proof.
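The maximin in Definition 6 is a standard linear program. The following is a rough sketch of that LP (my own code, not from the paper; it assumes numpy and scipy are available, and `security_level` is a name I made up):

```python
# Sketch: row player's security level max_x min_j (U^T x)_j via an LP.
import numpy as np
from scipy.optimize import linprog

def security_level(U):
    """Security level of the row player with payoff matrix U (m x n)."""
    m, n = U.shape
    # Variables: x_1..x_m (mixed strategy) and v (guaranteed value); maximize v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                                    # linprog minimizes, so use -v
    A_ub = np.hstack([-U.T, np.ones((n, 1))])       # v - (U^T x)_j <= 0 for each column j
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # probabilities sum to 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

# A matching-pennies-like game with 0/1 payoffs: the security level is 0.5.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
value = security_level(U)
```

The same LP, applied to the minimizing player, yields the minimax punishment strategy used in the proof of Lemma 1.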
(⇒) If there exists a non-terminal node h = [T_0, T_1, b] ∈ P such that sec_b(G_{T_0,T_1}) > u_b(o), then at node h, player b can deviate by choosing Play and playing a strategy in argmax_{β_b ∈ Δ(T_b)} min_{β_{−b} ∈ Δ(T_{−b})} u_b(β_b, β_{−b}) in the induced game, guaranteeing himself a utility of at least sec_b(G_{T_0,T_1}), which is larger than u_b(o).

(⇐) If for each non-terminal node h = [T_0, T_1, b] ∈ P, sec_b(G_{T_0,T_1}) ≤ u_b(o), consider a strategy profile that specifies:

- For nodes in the on-path history, choose the action that follows the on-path history; for every non-terminal node not in the on-path history, choose Play;
- For an off-path terminal node z = [X_b, T_{−b}] with X_b ⊆ T_b (where this node was reached because b deviated from the path), player −b plays argmin_{β_{−b} ∈ Δ(T_{−b})} max_{β_b ∈ Δ(X_b)} u_b(β_b, β_{−b}).

We claim that such a strategy profile forms a Nash equilibrium with outcome o. Suppose, for the sake of contradiction, that player b benefits from deviating to another strategy, resulting in induced on-path history P′. Let the longest common prefix of P and P′ be P_pre, ending with h = [T_0, T_1, b]. Thus, player b deviates at h. Then the disarmament stage ends at most one round after h, since either player b deviates by choosing Play, or player b chooses a different subset X_b to keep and player −b chooses Play immediately after that. Therefore, the induced game is G_{X_b,T_{−b}} with X_b ⊆ T_b, and −b will act to minimize b's utility. By the minimax theorem, player b's resulting utility is at most

    min_{β_{−b} ∈ Δ(T_{−b})} max_{β_b ∈ Δ(X_b)} u_b(β_b, β_{−b})
      = max_{β_b ∈ Δ(X_b)} min_{β_{−b} ∈ Δ(T_{−b})} u_b(β_b, β_{−b})
      ≤ max_{β_b ∈ Δ(T_b)} min_{β_{−b} ∈ Δ(T_{−b})} u_b(β_b, β_{−b})
      ≤ u_b(o).

This proves that we can restrict our attention to strategy profiles that explicitly specify on-path play and implicitly assume immediate Play and minimax punishment of deviators off-path. Moreover, note that the length of an on-path history is O(|S_0| + |S_1|) and the security level can be computed via linear programming.
Hence, an on-path history serves as a polynomial-length, polynomially verifiable certificate for the DISARM (or K-DISARM) problem, which is therefore in NP.

Complexity of K-DISARM and DISARM

In an on-path history of length K, the total number of turns in which a player chooses an action other than Play is at most K. For K ≤ 2, K-DISARM is in P, since there is a unique on-path history leading to the node o, namely the one in which each player removes all the strategies he needs to remove in his one (non-Play) disarmament turn. We now show that 3-DISARM is NP-complete by a reduction from the BALANCED-VERTEX-COVER problem.

Definition 7. In VERTEX-COVER, the objective is to check whether a graph (V, E) has a subset of the vertices V′ ⊆ V, with |V′| = L, such that every edge e ∈ E has at least one of its endpoints in V′. BALANCED-VERTEX-COVER is the special case of VERTEX-COVER in which L = |V|/2. BALANCED-VERTEX-COVER is NP-complete via a reduction from VERTEX-COVER (Conitzer and Sandholm 2006).

Theorem 1. 3-DISARM is NP-complete.

Proof. Given an instance of BALANCED-VERTEX-COVER with |V| = n and |E| = m, we construct a two-player normal-form game G(V, E) = ⟨S_0, S_1, u_0, u_1⟩ in which S_0 = {l} ∪ V and S_1 = {l} ∪ V ∪ E. The utilities are defined as follows:

- u_0(l, l) = u_1(l, l) = 1 − 2/n;
- u_0(v, l) = 2 for all v ∈ V;
- u_0(v, v′) = 1 for all v, v′ ∈ V with v ≠ v′;
- u_1(l, e) = 2 for all e ∈ E;
- u_1(v, e) = 1 for all e ∈ E and v ∈ V with v ∉ e;
- all unspecified utilities are simply 0.

The desired outcome o is (l, l). If there exists a balanced vertex cover V′, then the on-path history with h_1 = [{l} ∪ V′, S_1, 1], h_2 = [{l} ∪ V′, {l} ∪ V′, 0], and h_3 = [{l}, {l} ∪ V′, 1] is induced by a Nash equilibrium strategy profile with outcome o, by Lemma 1. This is because if player 0 deviates in the first round, player 1 can punish player 0 by playing any strategy e ∈ E; if player 1 deviates in the second round, player 0 can punish player 1 by playing uniformly from V′, which will result in a utility of 0 for player 1 at least 2/n of the time, because V′ is a vertex cover of size n/2; and if player 0 deviates in the third round, player 1 can punish player 0 by playing uniformly from V′, which will result in a utility of 0 for player 0 at least 2/n of the time.

For the other direction, suppose there exists an on-path history leading to o induced by a Nash equilibrium strategy profile. Note that (l, l) cannot be a Nash equilibrium in the induced game at the terminal node z = [T_0, T_1] if V ∩ T_0 ≠ ∅ or E ∩ T_1 ≠ ∅. Therefore, we have T_0 = {l} and T_1 ⊆ {l} ∪ V. In 3-DISARM, there are at most 4 different induced games along the on-path history. These must be G_{S_0,S_1}, G_{S′_0,S_1}, G_{S′_0,T_1}, and G_{{l},T_1}, where S′_0 = {l} ∪ V′ for some V′. According to Lemma 1, we must have sec_1(G_{S′_0,S_1}) ≤ 1 − 2/n and sec_0(G_{S′_0,T_1}) ≤ 1 − 2/n. In G_{S′_0,S_1}, since u_1(v, e) = 1 for all e ∈ E and v ∈ V with v ∉ e, if there exists an e ∈ E uncovered by V′, then player 1 can deviate by choosing Play and then playing strategy e, guaranteeing himself utility 1. Thus, V′ must be a vertex cover. As for G_{S′_0,T_1}, if |V′| > n/2, then player 0 can play uniformly among the strategies in V′ to guarantee himself utility 1 − 1/|V′| > 1 − 2/n. Thus, V′ cannot be larger than n/2. Therefore, V′ is a solution to the BALANCED-VERTEX-COVER problem.
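To see the construction concretely, here is a toy instantiation (my own illustration, not from the paper; the 4-cycle graph and function names are mine) that checks the punishment property of a balanced vertex cover:

```python
# A toy instance of the reduction on a 4-cycle, whose balanced vertex
# cover is {1, 3}. (My own sketch, for illustration only.)
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (4, 1)]
n = len(V)

def u0(s0, s1):
    if s0 == 'l' and s1 == 'l': return 1 - 2 / n
    if s0 in V and s1 == 'l':   return 2
    if s0 in V and s1 in V and s0 != s1: return 1
    return 0  # unspecified utilities are 0

def u1(s0, s1):
    if s0 == 'l' and s1 == 'l': return 1 - 2 / n
    if s0 == 'l' and s1 in E:   return 2
    if s0 in V and s1 in E and s0 not in s1: return 1
    return 0

assert u0('l', 'l') == u1('l', 'l') == 1 - 2 / n  # target outcome value

cover = [1, 3]  # a balanced vertex cover, |cover| = n/2
# Punishment: against uniform play over the cover, every edge strategy hits
# a covering vertex at least 2/n of the time, so it cannot beat 1 - 2/n.
for e in E:
    payoff = sum(u1(v, e) for v in cover) / len(cover)
    assert payoff <= 1 - 2 / n
```

This mirrors the argument in the proof: because the cover has size n/2, uniform play over it zeroes out any edge strategy with probability at least 2/n.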
The fact that only 3 rounds are available is essential for the reduction above to work: with more rounds, disarmament may be possible even without a balanced vertex cover, by alternatingly removing vertices for player 0 and edges for player 1 while ensuring that the remaining vertices for player 0 form a vertex cover for the remaining edges only. However, with a number of modifications to the reduction, we can prove that any successful disarmament of the modified game requires a balanced vertex cover of all the edges at some point in the process. We defer all remaining proofs to the full version due to space limitations.

Theorem 2. DISARM is NP-complete.

Mixed Disarmament

So far, we have considered only removing pure strategies. But pure disarmament has its limitations. Consider the standard prisoner's dilemma:

          C      D
    C     3, 3   0, 4
    D     4, 0   1, 1

    Table 2: Payoff matrix for the prisoner's dilemma

If either player removes his Defect strategy, the other player has no motivation to do the same: he would prefer to play Defect. The former player, anticipating this, would not remove Defect either. However, now suppose that the players are able to remove mixed strategies. It turns out that under this setting, there is a way to get to cooperation.

Example 2. Consider the prisoner's dilemma as presented above, and suppose the players can reduce their strategy spaces by limiting the maximum probability they can put on D. No matter how the players reduce their strategy spaces, it is always a dominant strategy to put as much probability on Defect as possible. Therefore, in any sequence of disarmament steps, the player who should take the last disarmament step that reduces the probability he can put on D will have no incentive to do so. As a result, it is impossible to reduce the probability that either player puts on D at all. This is reminiscent of how, in a finitely repeated prisoner's dilemma, cooperation cannot be attained.
But the same is not true for the infinitely repeated prisoner's dilemma, per the folk theorem. It turns out we can make a similar move here. Suppose that after each disarmament step, we flip a coin that comes up Heads with probability δ. If it comes up Heads, the players stop disarming and play the game. Otherwise, disarmament continues. In this way, neither player ever knows that she is about to take the last disarmament step.

Specifically, consider the following procedure. In each disarmament step, each player reduces the maximum probability he can put on D by a factor k ∈ (0, 1), so that once each player has taken t disarmament steps, he can put at most probability k^t on D. Once the coin comes up Heads, of course, both players will put maximum probability on D. For player 0, the expected utility of continuing to follow the protocol, given that both players have already reduced to k^t, is

    Σ_{t′=0}^{∞} (1 − δ)^{2t′} δ u_0(k^{t+1+t′}, k^{t+t′}) + Σ_{t′=0}^{∞} (1 − δ)^{2t′+1} δ u_0(k^{t+1+t′}, k^{t+1+t′}),

where u_0(p, q) = 1·pq + 0·(1 − p)q + 4·p(1 − q) + 3·(1 − p)(1 − q) is the utility to player 0 when the players play the game with limits p and q on D, respectively. On the other hand, if player 0 deviates at this point and is immediately punished by player 1 choosing Play, then player 0 obtains utility at most u_0(k^t, k^t). In order for player 0 to be best off following the protocol, the former needs to be no less than the latter; after simplification, we obtain (3δ − 2)(k − 1) ≥ 0. Since k < 1, we must have δ ≤ 2/3. Similarly, for player 1, after simplification, we have (k − 1)(k − 1/(3(1 − δ))) ≤ 0. Since δ ≤ 2/3, we have 1 − δ ≥ 1/3. Therefore, 1/(3(1 − δ)) ≤ k < 1.
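These incentive constraints are easy to probe numerically. Below is a rough sketch (my own code, not from the paper; it truncates the infinite sums, and `follow_value` is a name I made up) for the standard prisoner's dilemma:

```python
# Numerical sanity check of the coin-toss incentive condition for the
# standard prisoner's dilemma. (My own sketch, truncating the sums.)
def u0(p, q):
    # Player 0's utility when the players cap D at p and q and play the caps.
    return 1*p*q + 0*(1-p)*q + 4*p*(1-q) + 3*(1-p)*(1-q)

def follow_value(delta, k, t, horizon=200):
    """Player 0's expected utility of following the protocol from caps (k^t, k^t)."""
    total = 0.0
    for i in range(horizon):
        total += (1-delta)**(2*i)   * delta * u0(k**(t+1+i), k**(t+i))
        total += (1-delta)**(2*i+1) * delta * u0(k**(t+1+i), k**(t+1+i))
    return total

# delta = 0.5 <= 2/3 and k = 0.8 >= 1/(3(1-delta)) = 2/3: following beats deviating...
assert follow_value(0.5, 0.8, 0) >= u0(1, 1)
# ...but with delta = 0.8 > 2/3 the game ends too soon, and deviating is better.
assert follow_value(0.8, 0.8, 0) < u0(1, 1)
```

With δ = 0.5 and k = 0.8, the sum comes out around 1.125 against a deviation payoff of u_0(1, 1) = 1, while δ = 0.8 pushes it below 1, matching the δ ≤ 2/3 threshold derived above.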

To avoid dealing with nasty expressions involving δ in what follows, we observe that in the limit as δ → 0, the utility for following the protocol converges to 3, which is the utility that would be obtained after infinitely many rounds of disarmament. We can use this fiction, that following the protocol results in playing the game after infinitely many rounds of disarmament, to facilitate the analysis. Specifically, in this infinite-length model, all that is needed to ensure that nobody deviates is that the utility for deviating never exceeds 3. (In Lemma 2, we prove formally that as δ → 0, we reach the infinite-length model in the limit.)

Example 3. Consider now the following slightly different version of the prisoner's dilemma, in which a player receives utility 12 instead of 4 when he plays D while the other player plays C. As it turns out, in this game it is not possible to reach the (C, C) outcome with the infinite-disarmament model. The problem is that if we were to reach a point where both players have achieved certain values for the upper bound on D (for example, 1/2), then deviation at that point would give at least (1 + 12 + 0 + 3)/4 = 4, which is more than the 3 from reaching (C, C). Moreover, we cannot jump over these values. On the other hand, this is not really a problem, in the sense that both players would actually be better off with this profile than with (C, C). Thus, this example is consistent with the following claim: any sufficiently high utilities that occur in the game can be obtained via disarmament.

Specifically, we can attain utilities (4, 4) for the players with the following disarmament sequence (in the infinite-length model). Let p_t be the maximum probability on D after 2t disarmament steps, initialized to p_0 = 1. The players take turns reducing the maximum probability to p_{t+1} = (9p_t − 1)/(8p_t + 3), which converges to 1/2 as t → ∞.
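The recurrence can be checked directly; the sketch below (my own code, not the authors') verifies that player 1 is exactly indifferent at every step and that the caps converge to 1/2:

```python
# Check the recurrence p_{t+1} = (9 p_t - 1) / (8 p_t + 3) for the modified
# prisoner's dilemma (payoff 12 for defecting on a cooperator). My own sketch.
def u1(p, q):
    # Player 1's utility when player 0 caps D at p and player 1 puts q on D.
    return 1*p*q + 12*q*(1-p) + 0*p*(1-q) + 3*(1-p)*(1-q)

p = 1.0
for _ in range(1000):
    p_next = (9*p - 1) / (8*p + 3)
    # After player 0 reduces to p_next, player 1's deviation payoff is exactly 4.
    assert abs(u1(p_next, p) - 4) < 1e-9
    p = p_next

assert abs(p - 0.5) < 1e-6       # the caps converge to 1/2...
assert abs(u1(0.5, 0.5) - 4) < 1e-12  # ...where the utilities are (4, 4)
```

The indifference assertion is just the algebra that defines the recurrence: solving u_1(p_{t+1}, p_t) = 4 for p_{t+1} yields (9p_t − 1)/(8p_t + 3).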
In this case, player 1's utility from deviation after t steps is exactly 4 (so player 1 is indifferent^2), while player 0's incentive to deviate is smaller.

^2 Player 1 can be made to strictly prefer not deviating by reducing the upper bounds at a slightly slower rate.

Motivation. In practice, it is easy to imagine how one could commit to not playing a given pure strategy, but perhaps this is more difficult for the case of mixed strategies. Of course, the issue is similar for the case of committing to a mixed strategy, and yet this has found plenty of applications. Commitment to a mixed strategy is usually achieved by reputation, for example by the follower observing the leader's actions many times before acting. The same might work in our model. Alternatively, one could look for some unrelated random variable, such as the weather. If one can make it so that one cannot play a given pure strategy if it is (say) cloudy, then this upper-bounds the probability of the strategy at 70% (assuming it is cloudy 30% of the time). Finally, rather than interpreting mixed strategies literally as probability distributions, one can think of these probabilities as representing a quantitative choice, such as how much of one's continuum of resources to devote to a particular course of action, and then place upper bounds on this.

Formal Definition and Basic Properties

In mixed disarmament, each player can reduce his strategies by reducing the maximum probabilities on each of his pure strategies (with the requirement that at least one of these decreases strictly), or choose Play. For convenience, assume S_0 = S_1 = [n], and let 0 (and 1) denote a vector of all zeroes (ones). The representation of the game state therefore becomes [p_0, p_1, b], where p_b ∈ Δ^+(S_b) is the vector of probability upper bounds for player b. Here, Δ^+(S_b) denotes the set of vectors of probabilities summing to at least 1. In the induced game G_{p_0,p_1}, player b's strategy β_b must satisfy β_b ≤ p_b point-wise.
Strategies, on-path histories, and outcomes in mixed disarmament games are defined similarly to those in pure disarmament games. For convenience, let C(p) = {μ : ‖μ‖_1 = 1 and 0 ≤ μ ≤ p} denote the set of probability distributions consistent with upper bounds p. The security level in a mixed disarmament game is defined as follows.

Definition 8 (Security level in mixed disarmament game). The security level sec_b for player b in a two-player normal-form game G_{p_0,p_1} = ⟨p_0, p_1, u_0, u_1⟩ is the utility that player b can guarantee himself no matter how the other player plays. Formally,

    sec_b(G_{p_0,p_1}) = max_{β_b ∈ C(p_b)} min_{β_{−b} ∈ C(p_{−b})} u_b(β_b, β_{−b}).

Note that in the game G_{p_0,p_1}, C(p_0) × C(p_1) is a compact convex set. Therefore, the minimax theorem still holds for G_{p_0,p_1} (von Neumann 1928), and so does Lemma 1. We are now ready to demonstrate that as δ → 0, we reach the infinite-length model in the limit. Throughout, ε-equilibrium denotes approximate Nash equilibrium in the (standard) additive sense.

Lemma 2. Consider a Nash equilibrium (σ_0, σ_1) = ((α_0, β_0), (α_1, β_1)) of the infinite-length disarmament game, leading to^3 terminal node z = [p*_0, p*_1]. Then for any ε > 0, there exists some D > 0 such that for any 0 < δ < D, (σ_0, σ_1) is an ε-equilibrium of the δ-coin-toss disarmament game (where (β_0(z), β_1(z)) is played when the coin toss lands Heads).

A Folk Theorem for Mixed Disarmament

The folk theorem for repeated games shows that any utilities that exceed the players' security levels can be obtained in an equilibrium of the infinitely repeated game (as the discount factor approaches 1). We now show a similar result for (non-repeated) mixed disarmament games. (For pure disarmament games, the prisoner's dilemma provides a counterexample.) Let G_MD(G) denote the infinite-length mixed disarmament game resulting from normal-form game G, and call utilities (v_0, v_1) feasible for G if there exists a mixed-strategy profile of G that results in these utilities.

Theorem 3.
In mixed disarmament games G_MD(G), for all feasible utilities (v_0, v_1) with v_0 > sec_0(G) and v_1 > sec_1(G), there exists a Nash equilibrium of G_MD(G) such that the terminal node z = [p_0, p_1] of its induced on-path history satisfies sec_0(G_{p_0,p_1}) = v_0 and sec_1(G_{p_0,p_1}) = v_1.

^3 Here, "leading to" means either that we terminate at this node after finitely many rounds, or that the disarmament stage continues forever but the upper bounds converge to [p*_0, p*_1].

For any (β_0, β_1) constituting an equilibrium of G_{p_0,p_1}, we have u_b(β_0, β_1) ≥ sec_b(G_{p_0,p_1}); otherwise, player b could deviate to the strategy that guarantees his security level.

Corollary 1. In mixed disarmament games G_MD(G), for all feasible utilities (v_0, v_1) with v_0 > sec_0(G) and v_1 > sec_1(G), there exists a Nash equilibrium of G_MD(G) such that its outcome o satisfies u_0(o) ≥ v_0 and u_1(o) ≥ v_1.

In Theorem 3 and Corollary 1, the induced on-path history of the Nash equilibrium may have infinite length. Of course, by Lemma 2, we can come arbitrarily close to this with a game that will terminate in finite time with probability 1.

A Constructive Proof

In this subsection, we provide a constructive proof of Theorem 3 via an algorithm that generates the corresponding on-path history. Following the intuition provided in Example 3, in each round, the player to move reduces his upper bounds as much as possible, i.e., until his opponent's security level equals the opponent's target utility. Formally, given target utilities (v_0, v_1) with corresponding strategy profile (β_0, β_1), for each player b, let β̄_b = 1 − β_b. We introduce two parameters θ_0 and θ_1, initialized to 1, such that the upper bound vectors are p_0 = β_0 + θ_0 β̄_0 and p_1 = β_1 + θ_1 β̄_1. For convenience, let G_{β_0,β_1,θ_0,θ_1} = G_{β_0+θ_0 β̄_0, β_1+θ_1 β̄_1}. Define an update function f:

    f(b, θ_b, θ_{−b}) = inf{θ′_b : sec_{−b}(G_{β_b,β_{−b},θ′_b,θ_{−b}}) ≤ v_{−b}}.

That is, when p_0 = β_0 + θ_0 β̄_0 and p_1 = β_1 + θ_1 β̄_1 and it is player b's turn, the function f returns the minimum θ′_b such that player −b's security level in the induced game is still lower than or equal to his target utility. The function f(b, θ_b, θ_{−b}) can be computed efficiently by linear programming:

    minimize θ′_b
    subject to u_{−b}(μ_b, μ) ≤ v_{−b}   for all μ ∈ C(β_{−b}, θ_{−b}),
               μ_b ∈ C(β_b, θ′_b),

where C(β, θ) = C(β + θ(1 − β)). This program has infinitely many constraints of the first type, so we need to show that an efficient separation oracle exists. The first type of constraint is equivalent to

    max_{μ ∈ C(β_{−b},θ_{−b})} u_{−b}(μ_b, μ) ≤ v_{−b},

whose left-hand side can be computed by a simple water-filling method that puts as much probability as possible on the remaining strategy that provides player −b the most utility, given that player b plays mixed strategy μ_b. Therefore, an efficient separation oracle exists.

We now present the algorithm for generating the on-path history (Algorithm 1). Of course, if an infinite-length history is required, this algorithm will not terminate, but every point in the history will eventually be generated by it.

Lemma 3. Algorithm 1 produces an on-path history that is part of a Nash equilibrium and leads to a terminal node z = [p_0, p_1] with sec_0(G_{p_0,p_1}) = v_0 and sec_1(G_{p_0,p_1}) = v_1.

Algorithm 1: Generate an on-path history for feasible utilities that exceed the security levels
    Input: G and a target strategy profile (β_0, β_1)
    Output: An on-path history P for G_MD(G)
    1:  β̄_0 = 1 − β_0
    2:  β̄_1 = 1 − β_1
    3:  h_0 = [1, 1, 0]; b = 0; t = 0
    4:  θ_0 = θ_1 = 1
    5:  while θ_b − f(b, θ_b, θ_{−b}) > 0 do
    6:      θ_b = f(b, θ_b, θ_{−b})
    7:      h_{t+1} = [β_0 + θ_0 β̄_0, β_1 + θ_1 β̄_1, 1 − b]
    8:      b = 1 − b
    9:      t = t + 1
    10: h_{t+1} = [β_0 + θ_0 β̄_0, β_1 + θ_1 β̄_1]
    11: return h

Convergence Rate

Theorem 3 guarantees the existence of a Nash equilibrium with the desired utilities, but its induced on-path history may have infinite length. From Lemma 2, we know we can approximate this with a path that (with probability 1) has finite length, but this still does not tell us whether a reasonable number of rounds suffices. We next show that we can in fact approximate it in a reasonable number of rounds.
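The water-filling step in the separation oracle can be sketched as follows (my own code with a hypothetical helper name, not the authors' implementation; `payoffs[i]` is the opponent's utility for pure strategy i against the fixed μ_b):

```python
# Water-filling best response under probability upper bounds p (sum(p) >= 1):
# greedily assign as much probability as possible to the best strategies.
def water_fill_best_response(payoffs, p):
    """Return (mu, value): a cap-respecting distribution maximizing the payoff."""
    order = sorted(range(len(payoffs)), key=lambda i: -payoffs[i])
    mu, left = [0.0] * len(payoffs), 1.0
    for i in order:
        mu[i] = min(p[i], left)  # fill up to the cap, or until mass runs out
        left -= mu[i]
        if left <= 1e-12:
            break
    return mu, sum(mu[i] * payoffs[i] for i in range(len(mu)))

# Caps of 0.6 on each of three strategies, payoffs (5, 3, 1): the best
# strategy gets its full cap 0.6 and the second gets the remaining 0.4.
mu, val = water_fill_best_response([5.0, 3.0, 1.0], [0.6, 0.6, 0.6])
```

Given this oracle, the first constraint type can be checked (and a violated constraint returned) in O(n log n) time per query, which is what the ellipsoid-style LP solution requires.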
Instead of using coin tosses, here we simply stop after O(T) rounds, resulting in an O(n/T)-approximate equilibrium whose security levels are within O(n/T) of the desired values.

Theorem 4. In mixed disarmament games G_MD(G), for all feasible utilities (v_0, v_1) with v_0 > sec_0(G) and v_1 > sec_1(G), and for all ε > 0, there exists an nε-Nash equilibrium of G_MD(G) such that the length of its on-path history is O(1/ε) and the terminal node z = [p_0, p_1] satisfies sec_0(G_{p_0,p_1}) > v_0 − nε and sec_1(G_{p_0,p_1}) > v_1 − nε.

Conclusion

We have shown that while disarmament with pure strategies is NP-hard, with mixed strategies a type of folk theorem holds and an efficient algorithm exists. Since this is, to our knowledge, the first paper on this topic, there are many directions for future research. These include studying the topic for representations other than the normal form and for solution concepts other than Nash equilibrium. One could also consider different types of mixed disarmament, in which the mixed strategy space is reduced in a way other than putting upper bounds on the probabilities. Of course, our result shows that upper bounds already allow us to attain everything that can reasonably be expected. In any case, as long as the disarmament procedure ensures that the space of remaining mixed strategies stays compact and convex and there exists an efficient separation oracle for the LP that computes the update function f, our results should continue to hold. Another direction is to design algorithms for the pure disarmament case that work well in practice; in the full version of our paper, we give a mixed-integer linear program formulation for this. Finally, we can look for new applications of this framework.

Acknowledgments

We are thankful for support from ARO under grants W911NF and W911NF , NSF under awards IIS and CCF , the Future of Life Institute, and a Guggenheim Fellowship. We also thank Josh Letchford for helpful early discussions.

References

Anderson, A.; Shoham, Y.; and Altman, A. Internal implementation. In Proceedings of the Ninth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS).
Bošanský, B.; Brânzei, S.; Hansen, K. A.; Miltersen, P. B.; and Sørensen, T. B. Computation of Stackelberg equilibria of finite sequential games. In Proceedings of the Eleventh Conference on Web and Internet Economics (WINE-15).
Brill, M.; Freeman, R.; and Conitzer, V. Computing possible and necessary equilibrium actions (and bipartisan set winners). In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.
Conitzer, V., and Sandholm, T. Computing the optimal strategy to commit to. In Proceedings of the ACM Conference on Electronic Commerce (EC).
Deng, Y.; Tang, P.; and Zheng, S. Complexity and algorithms of K-implementation. In Proceedings of the Fifteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS).
Letchford, J., and Conitzer, V. Computing optimal strategies to commit to in extensive-form games. In Proceedings of the ACM Conference on Electronic Commerce (EC).
Letchford, J.; Conitzer, V.; and Munagala, K. Learning and approximating the optimal strategy to commit to. In Proceedings of the Second Symposium on Algorithmic Game Theory (SAGT).
Monderer, D., and Tennenholtz, M. K-Implementation. Journal of Artificial Intelligence Research 21.
Tambe, M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.
von Neumann, J. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100.
von Stengel, B., and Zamir, S. Leadership games with convex strategy sets. Games and Economic Behavior 69.


More information

Exact and Approximate Equilibria for Optimal Group Network Formation

Exact and Approximate Equilibria for Optimal Group Network Formation Noname manuscript No. will be inserted by the editor) Exact and Approximate Equilibria for Optimal Group Network Formation Elliot Anshelevich Bugra Caskurlu Received: December 2009 / Accepted: Abstract

More information

Ex Post Cheap Talk : Value of Information and Value of Signals

Ex Post Cheap Talk : Value of Information and Value of Signals Ex Post Cheap Talk : Value of Information and Value of Signals Liping Tang Carnegie Mellon University, Pittsburgh PA 15213, USA Abstract. Crawford and Sobel s Cheap Talk model [1] describes an information

More information

Distributed Optimization. Song Chong EE, KAIST

Distributed Optimization. Song Chong EE, KAIST Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links

More information

- Well-characterized problems, min-max relations, approximate certificates. - LP problems in the standard form, primal and dual linear programs

- Well-characterized problems, min-max relations, approximate certificates. - LP problems in the standard form, primal and dual linear programs LP-Duality ( Approximation Algorithms by V. Vazirani, Chapter 12) - Well-characterized problems, min-max relations, approximate certificates - LP problems in the standard form, primal and dual linear programs

More information

Optimal Efficient Learning Equilibrium: Imperfect Monitoring in Symmetric Games

Optimal Efficient Learning Equilibrium: Imperfect Monitoring in Symmetric Games Optimal Efficient Learning Equilibrium: Imperfect Monitoring in Symmetric Games Ronen I. Brafman Department of Computer Science Stanford University Stanford, CA 94305 brafman@cs.stanford.edu Moshe Tennenholtz

More information

On the Complexity of Two-Player Win-Lose Games

On the Complexity of Two-Player Win-Lose Games On the Complexity of Two-Player Win-Lose Games Tim Abbott, Daniel Kane, Paul Valiant April 7, 2005 Abstract The efficient computation of Nash equilibria is one of the most formidable computational-complexity

More information