Multi-sided pre-play communication by burning money Sjaak Hurkens y CentER, Tilburg University P.O. Box 90153 5000 LE Tilburg, The Netherlands November 1994 Abstract We investigate the consequences of allowing some players to send a costly message before a game is played. Since messages have no literal meaning sending costly messages is also called `burning money'. We consider a setting with n players among which k have the possibility to burn money. We show that strategy proles in sets that are closed under rational behavior, yield all players that have the possibility to burn, their preferred outcome. Moreover, in such equilibria no money is actually burnt, the possibility alone suces. For the special case with two players all results go through for persistent retracts as well. Journal of Economic Literature Classication Number: C72. Copyright c1996 by Academic Press. This material has been published in Journal of Economic Theory, 69, 186-197, the only denitive repository of the content that has been certied and accepted after peer review. Copyright and all rights therein are retained by Academic Press. This material may not be copied or reposted without explicit permission. ythe author thanks Eric van Damme for many helpful discussions, and an anonymous referee and an associate editor for helpful comments. This research was sponsored by the Foundation for the Promotion of Research in Economic Sciences, which is part of the Netherlands Organization for Scientic Research (NWO). 1
1 Introduction Given a game in normal form, Ben-Porath and Dekel [4] investigate the consequences of allowing some players to signal future actions by incurring costs before the game is played. They consider equilibria that survive repeated elimination of weakly dominated strategies. Their main result is that, in a certain class of two person games, if only one player can signal, then repeated elimination of weakly dominated strategies selects her most preferred outcome. Moreover, the player does not have to incur any cost to achieve this. While this result is nice, it has some important limitations. First of all, Ben-Porath and Dekel consider only two player games. As we will show later by example, repeated elimination of weakly dominated strategies need not work in games with more than two players. Furthermore, they show that if both players have the opportunity to signal (simultaneously), then signaling future actions is not possible, not even if the game has common interests. Finally, they need that a player can burn a considerable amount of money. We consider a more general case. We extend n-person games by allowing k of the players to signal future actions by incurring costs. In order to obtain results similar to those of Ben-Porath and Dekel we work with a stronger solution concept. Van Damme [6] showed that stable equilibria do not necessarily lead to ecient outcomes. We show that, if the notion of curb or curb* retracts (Basu and Weibull [3]) is used, then signaling future actions is possible. In fact, our result is stronger than that, since even giving a dummy player the opportunity to burn money can inuence the outcome. Roughly speaking, if the players that can burn money have common interests in the underlying game, then curb and curb* select their preferred outcome. Furthermore, in all models of pre-play communication, no costs are actually incurred. Moreover, if there are two players then the notion of persistence (Kalai and Samet [8]) gives the same results. In the next section we will consider a simple example of a two person game where only one player can signal. From this example the reader can develop some feeling for the solution concept we employ. Moreover, this example shows a dierence between our 2
approach and that of Ben-Porath and Dekel's. At the same time it shows that it is important that messages are costly. In Section 3 the formal model and the notions of curb and curb* retracts are dened. In Section 4 we prove the theorem for these concepts for the general case. In Section 5 we consider persistent retracts in the special case of two person games. Moreover, it is shown that in games with more than two players persistence need not work. 2 An example: the battle of the sexes Consider the battle of the sexes game represented by the normal form in Figure 1. The woman (the row player) prefers to go to a soccer match (`S'), the man (the column player) prefers to go to the theater (`t'). s t S 9,5 0,4 T 4,4 6,7 Figure 1: the battle of the sexes. Suppose the woman can send one of two messages, m 0 or m 1. Message m 0 is costless and message m 1 costs c. Later we will consider the cases c = 0, c = 1 and c = 2. Let m i E denote the woman's strategy when she sends m i and visits event E. Let e 0 e 1 denote the man's strategy when he goes to event e i if he receives message m i. Then the game with one-sided pre-play communication can be represented by the normal form in Figure 2. ss st ts tt m 0 S 9,5 9,5 0,4 0,4 m 0 T 4,4 4,4 6,7 6,7 m 1 S 9? c; 5?c; 4 9? c; 5?c; 4 m 1 T 4? c; 4 6? c; 7 4? c; 4 6? c; 7 Figure 2. First we will consider Ben-Porath and Dekel's approach. Consider the case c = 2. Starting with deleting action m 1 T, we nd that repeated elimination of weakly domi- 3
nated strategies yields the equilibrium (m 0 S; ss). Hence, repeated elimination of weakly dominated strategies selects the outcome where both go to the soccer match and no money is actually being burnt. Notice that this reasoning does not go through if c = 0 or if c = 1. In these cases no player has a dominated strategy. For example, if c = 1, m 1 T can only be dominated by a strategy that puts at least weight 5 on 6 m0 T, and at least weight 1 on 5 m0 S. We will look at minimal sets of strategies that are closed under best responses. (Basu and Weibull [3] call this curb: closed under rational behavior.) A set of strategy proles is closed under best responses if with every strategy prole, all its best replies are contained in the set. Now look at our example. Suppose R 1 R 2 is closed under best responses. It is easy to check that we have the following implications (for all values of c under consideration). m 1 T 2 R 1 ) tt 2 R 2 ) m 0 T 2 R 1 ) ts 2 R 2 ) m 1 S 2 R 1 ) ss 2 R 2 ) m 0 S 2 R 1 ) ss; st 2 R 2 In the cases with c > 0, the set of strategy proles where the woman plays m 0 S and the man mixes between ss and st is closed under best responses. implications shows that this set is the unique minimal one. If c = 0 the series of implications continues. The series of ss; st 2 R 2 ) m 1 S 2 R 1 ) ts 2 R 2 Since m 0 T and m 1 T are best replies against 1 st + 1 ts, we conclude that the only set 2 2 that is closed under best responses is the set of all strategy proles. Hence, if talk is costless, then on the basis of curb no sharp predictions can be made in this game. 1 3 The model In this section we will formally set up the model of multi-sided pre-play communication in n-person games. Before we do that we recall the notions of curb and curb* (Basu and Weibull [3]). 4
We denote the set of mixed strategies of player i by i. For a set of pure strategies X let co(x) denote the convex hull of this set, which is equivalent to the set of all probability distributions over X. If is a strategy prole then?i denotes the prole of all players besides player i. The set of best replies of player i against?i is denoted by BR i (?i ). Dene BR() = Q n i=1 BR i (?i ). UBR() denotes the set of best replies against that are not weakly dominated. A retract is a cartesian product R = Q n i=1 R i, where R i i is non-empty, convex and compact. Denition 1. A retract R is said to have the curb property if for all 2 R, BR() R. A retract R is said to have the curb* property if for all 2 R, UBR() R. A retract which is minimal with respect to the curb (curb*) property is called a curb (curb*) retract. The minimality condition is imposed since the set of all strategy proles trivially satises the curb and curb* properties. A curb retract can be viewed as a set-valued generalization of a strict equilibrium, or as a self-enforcing agreement. If an outsider recommends all players to play strategies from some curb retract R, each player should follow this recommendation, whenever he expects the other players to do so. Even when there is no outsider to give such a recommendation, players may learn to play curb strategies (see Hurkens [7] or Appendix I for a summary of [7]). The comparison with strict equilibria is not completely valid since curb retracts may contain weakly dominated strategies. Curb* retracts do not contain pure weakly dominated strategies. The two dierent concepts are closely related and have some nice properties that are summarized in the following lemma. Lemma 1. Every curb retract contains a curb* retract. Every curb (curb*) retract is a cartesian product of convex hulls of pure strategies. The intersection of two retracts with the curb (curb*) property is either empty or it has the curb (curb*) property, hence dierent curb (curb*) retracts are disjoint. Proof. Easy and therefore omitted. 2 Let G = (S 1 ; : : : ; S n ; u 1 ; : : : ; u n ) be an n-person game with player set N. We split up the set of players into C and D. C is the set of players that can send a message 5
in the pre-play communication stage (communicating players); D is the set of players that cannot (dumb players). Note that the players in D are dumb but not deaf. It is important that they can hear. We assume that C is not empty. D may be empty. We will assume Assumption 1. There exists s 2 S such that u i (s ) > u i (s), for all i 2 C and all s 6= s and such that u j (s ) > u j (s?j ; s j) for all j 2 D and all s j 6= s j. Notice that if C = N then G is a common interest game (Aumann and Sorin [1]). Assumption 1 says that the game has a strict Nash equilibrium that gives all communicating players their highest payo. We assume that all communicating players have the same set of messages M = fm 0 ; m 1 ; : : : ; m L g at their disposal and that the cost of sending message m p is c(m p ) for all of them. These assumptions are made for notational convenience only. They do not play a role in the results we will derive. We assume that c(m 0 ) = 0, c(m 1 ) > 0 is `small' and c(m p ) > c(m 1 ) for all p > 1. In fact, c(m 1 ) is so small that c(m 1 ) < u i (s )? u i (s) for all i 2 C and all s 6= s This implies that any communicating player who can induce the play of s by sending message m 1, will do so, unless sending m 0 induces the play of s also. We denote the game with pre-play communication by P C (G) = (T 1 ; : : : ; T n ; v 1 ; : : : ; v n ), where T i = fm i f i jm i 2 M and f i : M Cnfig! S i g (i 2 C) T j = ff j jf j : M C! S j g (j 2 D) (We use the following convention if C = fig: f i : M ;! S i corresponds to a single element of S i.) With some abuse of notation we have for mf 2 T = Q n l=1 T l, v i (mf) = u i (f(m))? c(m i ) (i 2 C) v j (mf) = u j (f(m)) (j 2 D) Remark: The way we have dened the strategy space here means that we are looking at the reduced normal form (as did Ben-Porath and Dekel). In the normal form a communicating player's strategy also depends on his own message. It does not matter 6
for our results whether we analyze the normal form or the reduced normal form. However, in the normal form the communicating players have a lot of equivalent strategies and this makes the proof for persistence quite tedious. 4 Results for the general case In this section we will prove the theorem for the curb and curb* retracts. We will deal with the general case with n players among which the players in C can communicate. Theorem 1. P C (G) has a unique curb (curb*) retract, and every strategy prole in this retract yields player l u l (s ) (l 2 N). Proof. We will denote a vector with all coordinates equal to m 0 by m d0. (Depending on whether this vector is in the domain of a dumb or of a communicating player, it has #C or #C? 1 coordinates. No confusion will result.) Dene F = fmf 2 T jm = d m 0 ; f l ( d m 0 ) = s l for all l 2 Ng Notice that F consists of all pure strategy proles that yield the payo vector u(s ). We will show that every curb* retract has a non-empty intersection with F. If this is the case then every curb* retract is contained in R = Q n l=1 co(f l ), since R has the curb* property. Since every curb retract contains a curb* retract, we know then that every curb retract has a non-empty intersection with R. Since R has the curb property itself, any curb retract is contained in R. Since all strategy proles in R yield the payo u(s ), it follows that R is the unique curb retract, and that the unique curb* retract is obtained from R by deleting every strategy that has pure weakly dominated strategies in its support. Then the proof has been completed. Let R be a curb* retract and let m f 2 R n F. Now player l has a lot of undominated best replies against ( m f)?l. This is so because a player has a lot of freedom in how to react on tuples of messages that he does not receive. In particular there exists m 0 f 0 2 R such that (i) for all i 2 C and all m?i 6= m?i, f 0 i(m?i ) = s i, and (ii) for all j 2 D and all m 6= m, f 0 j(m) = s j. If m 0 f 0 2 F we are ready, because then F \ R 6= ;. 7
So suppose m 0 f 0 62 F. Case 1: m 6= d m 0. There exists i 2 C with m i 6= m 0. All best replies for i against m 0 f 0 involve sending m 0. In particular, if f 00 i (m) = s i for all m, then m 0 f 00 i is an undominated best reply to m 0 f 0 and hence contained in R i. Now for j 2 C, j 6= i, all best replies against z = (m 0 f 00 i ; m0 f 0?i) involve sending m 0. In particular, if f 00 j (m) = s j for all m, then m 0 f 00 j is an undominated best reply to z and hence contained in R j. Thus F \R 6= ;. Case 2: m = d m 0. Let i 2 C. All best replies for i against m 0 f 0 involve sending m 1. Hence, there exists m f 2 R n F with m 6= d m 0. This brings us back to case 1. 2 5 Further results for two player games Let us recall the notion of persistent retracts (Kalai and Samet [8]). Denition 2. A retract R is called absorbing if there exists an open neighborhood U R such that for all 2 U, we have that BR() \ R 6= ;. An absorbing retract is called persistent if it does not properly contain an absorbing retract. For an elaborate discussion of the relation between persistent retracts and other solution concepts the reader is referred to Balkenborg [2]. Two interesting facts on persistent retracts we will state here, without proofs. Every curb* retract contains a persistent retract (Balkenborg [2]) and every persistent retract contains a Nash equilibrium (Kalai and Samet [8]). Let us state the following lemma which will prove to be useful in the two player case. The proof is relegated to Appendix II. Lemma 2. Let G = (S 1 ; S 2 ; u 1 ; u 2 ) be a game. For F = F 1 F 2 S 1 S 2, let H i = S i n F i. If (1) for all f; f 0 2 F, u i (f) = u i (f 0 ), (2) for all f 2 F and h i 2 H i, u i (h i ; f j ) < u i (f), and (3) every persistent retract has a non-empty intersection with F, then every persistent retract is contained in R = co(f1 ) co(f 2 ). From now on we will assume 8
Assumption 2. No player has equivalent strategies in the game with pre-play communication. Then we have that every persistent retract of this game is a cartesian product of convex hulls of pure strategies. For a proof see Kalai and Samet [8]. 5.1 One-sided pre-play communication First we will consider the case with one-sided pre-play communication, hence C = f1g. Assumption 1 is now equivalent to saying that the underlying game has a strict Nash equilibrium yielding player 1 a higher payo than any other strategy prole. Theorem 2. Every strategy prole in a persistent retract of P f1g (G) yields player l u l (s ) (l 2 N). Proof. Let R be a persistent retract and let ms 1 2 R 1. Case 1: m 6= m 0. Then there exists g 2 R 2 with g(m 0 ) = s 2. This is so because all best replies against (1?)ms 1 + m 0 s 1 have this property. This implies that m 0 s 1 2 R 1, since it is the unique best reply against g. Case 2: m = m 0 and s 1 6= s 1. Using the same trick as before we see that there exists g 2 R 2 with g(m 1 ) = s 2. If g(m 0 ) = s 2, then m 0 s 1 2 R 1. If g(m 0 ) 6= s 2, then m 1 s 1 2 R 1. This brings us back to case 1. Hence, m 0 s 2 1 R 1. It is easy to check that the set of extreme points of F = fm 0 s 1g co(fg : M! S 2 jg(m 0 ) = s 2g) satises assumptions (1), (2) and (3) of Lemma 2. Then it follows from Lemma 2 that R F. Notice that every strategy prole in F yields player l u l (s ) and the theorem has been proved. 2 5.2 Two-sided pre-play communication Now we turn our attention to the two-sided pre-play communication game, i.e. C = f1; 2g. Recall that Assumption 1 now says that the underlying game has common interests. 9
Theorem 3. Every strategy prole in a persistent retract of P C (G) yields player l a payo of u l (s ) (l 2 N). Proof. Let R be a persistent retract and let mf 2 R i. Let j = 3? i. Case 1: m 6= m 0. Then there exists m 0 g 2 R j m 0 f 0 2 R i with f 0 (m 0 ) = s i. with g(m 0 ) = s j. Hence, there exists Case 2: m = m 0 and f(m 0 ) 6= s i. Then there exists m0 g 2 R j with g(m 1 ) = s j. If g(m 0 ) = s j, then there exists m 0 f 0 2 R i with f 0 (m 0 ) = s i. If g(m 0 ) 6= s j, then there exists m 1 f 00 2 R i. This brings us back to case 1. Let F i = fm 0 fjf(m 0 ) = s i g. The two cases showed that F i \ R i 6= ;. It is easy to check that F 1 and F 2 satisfy assumptions (1), (2) and (3) of Lemma 2. Hence, any persistent retract is contained in co(f 1 ) co(f 2 ). Notice that any strategy prole in this set yields player l u l (s ) and the theorem has been proved. 2 5.3 A three player counterexample The following example shows that persistence need not work in games with more than two players. Example. Let n = 3, C = f3g. Player 1 chooses between rows R 1 and R 2, player 2 chooses between columns C 1 and C 2 and player 3 chooses A. (Player 3 has only one action.) The payos are given in Figure 3. C 1 C 2 R 1 2,2,2 0,0,0 R 2 0,0,0 1,1,1 Figure 3. Now consider the following strategy prole: players 1 and 2 play their second strategy after all messages and player 3 sends the costless message and then `chooses' A. This yields the strictly dominated payo vector (1; 1; 1). Nevertheless, this strategy prole is a persistent retract as a singleton and hence a persistent equilibrium. In a small neighborhood of the retract player 3 has a unique best reply. Players 1 and 2 have a lot of best replies against the retract, but in a small neighborhood outside the retract they 10
have a unique best reply. This is due to the fact that players 1 and 2 have an interest in choosing the same action: in a small neighborhood player 1 plays R 2 with a very high probability, after any message, and hence player 2 has to choose C 2, after any message. Repeated elimination of weakly dominated strategies does not work either. In this example the messages are not important to signal the intended action of player 3, since he has only one action. Hence, our results are not related to forward induction. The reason that the dummy player has an eect on the outcome (when curb or curb* is applied), is that the other players may change their reply to (currently) unsent messages. The dummy player is willing to try a dierent (and possibly more costly) message, whenever the rst two players fail to coordinate on the ecient equilibrium. 2 Appendix I In Hurkens [7], we construct a dynamic learning process to support the concepts of curb retracts. We briey summarize that paper. Roughly speaking, the learning evolves as follows: A particular game is played at discrete points in time. For each role in this game there is a pool of players. At the beginning of each period one player is drawn from each pool. These players will play the game in that period. Players have a bounded memory. On the basis of strategies played in the recent past, they form expectations about the strategies the other players will use, and best respond to these expectations. It is assumed that dierent players within the same pool may have dierent beliefs and therefore they may choose dierent actions. It is shown that, if the memory is long enough, play will settle down in a minimal curb set. The intuition for this result is quite clear. By having a large enough memory, players may have beliefs with large supports. This means that best replies against all kinds of mixtures will be played now and then. This creates so much stochastic variability that players sooner or later will play curb strategies. When they keep drawing the \right" samples, they will keep best responding against curb strategies, and hence they will play curb strategies again. It might happen that they will do this many periods in a row. The probability that this happens at a specic point in time is only small, but with 11
probability one it will happen eventually. By that time all non-curb strategies will be forgotten. The strategies that will be played from that point on, will depend on the sample drawn, but it is sure that it will be curb strategies again. Appendix II Proof of Lemma 2. Let R be a persistent retract, and let V 1 V 2 R be an open neighborhood that is absorbed by R. Let x i = u i (f) for (all) f 2 F and dene m + i = max u i (s 1 ; s 2 ) m? i = min u i (s 1 ; s 2 ) (h 1 ) = x 1? max f 2 2F 2 u 1 (h 1 ; f 2 ) (h 1 2 H 1 ) (h 2 ) = x 2? max f 1 2F 1 u 2 (f 1 ; h 2 ) (h 2 2 H 2 ) Notice that (2) implies (h i ) > 0. Let i = min hi 2H i (h i ). Let j > 0 (j = 3? i) satisfy (1? j )(x i? i ) + j m + i < (1? j )x i + j m? i Let i (s i ) denote the weight that i puts on s i. Dene V i ( i ) = f i 2 V i j P h i 2H i i (h i ) < i g. Then V 1 ( 1 ) V 2 ( 2 ) is an open neighborhood of R \ R. We claim that this neighborhood is absorbed by R \ R. Let 2 2 V 2 ( 2 ), h 1 2 H 1, f 1 2 F 1. Now u 1 (h 1 ; 2 ) X F 2 2 (f 2 )(x 1? (h 1 )) + X H 2 2 (h 2 )m + 1 < (1? 2 )(x 1? 1 ) + 2 m + 1 < (1? 2 )x 1 + 2 m? 1 X F 2 2 (f 2 )x 1 + X H 2 2 (h 2 )u 1 (f 1 ; h 2 ) = u 1 (f 1 ; 2 ) Hence, there are no best replies against strategies in V 2 ( 2 ) in H 1. Hence, the best replies are in F 1. Reversing the roles of the players then proves the claim. Hence, every persistent retract is contained in R. 2 12
References 1. R. Aumann and S. Sorin, Cooperation and bounded recall, Games Econ. Behav. 1 (1989), 5-39. 2. D. Balkenborg, \The properties of persistent retracts and related concepts," Ph.D. thesis, University of Bonn, October 1992. 3. K. Basu and J.W. Weibull, Strategy subsets closed under rational behavior, Econ. Lett. 36 (1991), 141-146. 4. E. Ben-Porath and E. Dekel, Signaling future actions and the potential for sacrice, J. Econ. Theory 57 (1992), 36-51. 5. A. Blume, \Communication, risk and eciency in games," mimeo, University of Iowa, 1993. 6. E. van Damme, Stable equilibria and forward induction, J. Econ. Theory 48 (1989), 476-496. 7. S. Hurkens, \Learning by forgetful players: From primitive formations to persistent retracts," CentER Discussion Paper 9437, Tilburg University, 1994. 8. E. Kalai and D. Samet, Persistent equilibria in strategic games, Int. J. Game Theory 13 (1984), 129-144. 13
Footnotes 1. Blume [5] shows that the equilibria in a curb retract of the one-sided cheap talk game select the equilibrium preferred by the sender, provided that the risk associated with this equilibrium in the underlying game is suciently low. Indeed, if the payo to (S; t) is changed from `0; 4' into `2; 4' (which reduces the risk of (S; s)), then the unique curb retract does not contain m 0 T, m 1 T or tt, and all equilibria in the curb retract yield 9; 5. 2. We have analyzed the game with pre-play communication in the reduced normal form. Our results for curb and curb* extend to the agent normal form, but the results for persistence do not. This need not come as a surprise, since the agent normal form has more than two players. 14