Fast convergence in evolutionary models: A Lyapunov approach


Fast convergence in evolutionary models: A Lyapunov approach

Glenn Ellison, Drew Fudenberg, Lorens A. Imhof*

January 6, 2015

Abstract

Evolutionary models in which N players are repeatedly matched to play a game have fast convergence to a set A if the models both reach A quickly and leave A slowly, where "quickly" and "slowly" refer to whether the expected hitting and exit times remain bounded when N tends to infinity. We provide simple and general Lyapunov criteria which are sufficient for reaching quickly and leaving slowly. We use these criteria to determine aspects of learning models that promote fast convergence.

Keywords: hitting time, learning model, Lyapunov function, Markov chain, recency

1 Introduction

In this paper we provide a Lyapunov-function condition for a finite-population evolutionary process to reach various target sets quickly, in the sense that the expected waiting time is uniformly bounded over all initial conditions and population sizes. We also provide a complementary condition under which the process leaves the target set slowly, in the sense that the probability of getting more than ε away from the set in any fixed time goes to zero as the population size increases. As this latter property seems necessary for convergence to the target set to be interesting, we only say that there is fast convergence when both conditions hold.

Our starting point is a collection of time-homogeneous Markov chains S^N = {S^N(t) : t = 0, 1/N, 2/N, ...} with finite state space Ω_N that can vary with the parameter N, which we will use to index the number of players in the population. These Markov chains may track, for example, the play of each player at each location of a network.

* We thank Ryota Iijima, Gabriel Kreindler, Larry Samuelson, and Bill Sandholm for helpful comments and National Science Foundation grant SES for financial support. Massachusetts Institute of Technology, Harvard University, and Bonn University. gellison@mit.edu, dfudenberg@harvard.edu, limhof@uni-bonn.de.

We then define functions φ_N on Ω_N that map to a space Δ of fixed dimension, and consider the processes given by X^N(t) = φ_N(S^N(t)). For example, these processes can describe the frequency of agents using each of a finite number of pure strategies in a game.

We present a pair of general results that use a Lyapunov function V to provide sufficient conditions for reaching quickly and leaving slowly. The first proposition says roughly that the system reaches a subset A of Δ quickly if V(X^N(t)) decreases in expectation at rate at least c > 0 whenever the state X^N(t) is outside A. Intuitively, if V(X^N(t)) is declining at rate c in expectation, then if it is currently at x it should reach A in about V(x)/c periods. The second proposition provides a closely related condition for leaving A slowly: If the maximum rate at which the state can move in a single step is bounded and V(X^N(t)) is decreasing in expectation at rate c whenever X^N(t) is outside A, then the system will leave A slowly. Intuitively, this is because getting more than ε away from A would take a large number of steps against the drift. We also consider extensions to the case where the expected decrease of the Lyapunov function decreases as the limit is approached, and to the case where a set is reached quickly through a multi-step process that need not monotonically approach the limit.

Section 4 uses our general results to examine whether convergence is fast in various evolutionary models. The first subsection examines models in which the evolution of the process depends only on the fractions of players using each pure strategy, as in Kreindler and Young's (2013) study of evolution with logistic choice. Here we show that there will be fast convergence to a neighborhood of the state in which all agents use one particular strategy whenever the popularity of that strategy is increasing in expectation when the state is outside the neighborhood. Such an argument is sufficient to establish fast convergence under a variety of different specifications of how strategies are revised, including stochastic choice, observation errors, and beliefs generated by random sampling. In some cases there is fast convergence if choice is sufficiently noisy (relative to the degree of risk dominance) while in others there is fast convergence for all noise levels.

The second subsection discusses models that are naturally analyzed using more complex state spaces. Here it is particularly advantageous that our Lyapunov results let us work directly with random models (as opposed to taking a two-step approach of establishing that the model converges in a sufficient sense to some deterministic limit dynamic as N → ∞, and then showing that the limit dynamic reaches a limit quickly). One application here is a finite-memory fictitious play model in which we show that there is fast convergence to a neighborhood of the risk-dominant equilibrium if players place enough weight on their most recent observations. The second is a model with both local and uniform interactions in which there is fast convergence to the set where most players use the risk-dominant action if the noise level is above a specified threshold.

2 Model and definitions

Suppose that for each integer N = 1, 2, ... we are given a discrete-time homogeneous Markov chain S^N = {S^N(t) : t = 0, 1/N, 2/N, ...} with finite state space Ω_N.¹ Let Δ be the (m−1)-dimensional unit simplex for some fixed integer m > 1, that is, Δ = {x ∈ [0, 1]^m : x_1 + ... + x_m = 1}. In our applications to models of N players playing a game with m pure strategies, a point x ∈ Δ gives the fraction of players using each of the m strategies. Given some function φ_N : Ω_N → Δ, let X^N be the stochastic process on Δ determined by X^N(t) = φ_N(S^N(t)). We denote the conditional probability and the conditional expectation given S^N(0) = s by P_s and E_s. For every A ⊆ Δ let Ā denote its closure.

To discuss the speed with which the process X^N reaches the set A we write τ_A = inf{t ≥ 0 : X^N(t) ∈ A} for the random variable giving the time at which A is first reached and define W^N(A, s) = E_s(τ_A) to be the expected wait to reach A conditional on the process S^N starting at state s. Our first main definition is

Definition 1. The family {X^N} reaches A quickly if

lim sup_{N→∞} sup_{s∈Ω_N} W^N(A, s) < ∞.

Note that as in Kreindler and Young (2013) this is an asymptotic property, meaning that the expected waiting time remains bounded (uniformly over all starting points) in the N → ∞ limit. Our second main definition is

Definition 2. The family {X^N} leaves A slowly if for any finite T and for every open set U containing Ā,

lim_{N→∞} max_{s ∈ φ_N^{-1}(A)} P_s(τ_{Δ\U} ≤ T) = 0.  (1)

1 We believe that the assumption that S^N has a finite state space is not critical; we use it as a convenient way to avoid technical difficulties.

Note that the requirement in the definition is stronger than requiring that

lim_{N→∞} min_{s ∈ φ_N^{-1}(A)} W^N(Δ \ U, s) = ∞.

We made the definition more demanding in this dimension so that we will not count a model as leaving a set slowly just because there is some probability of being trapped within A for a very long time. Instead, it must be the case that even for very large T, the probability of escaping within T periods vanishes in the limit.² Finally, we define fast convergence as the combination of these two properties.

Definition 3. The family {X^N} has fast convergence to A if {X^N} reaches A quickly and leaves A slowly.

2 It would have been more accurate to say "{X^N} is likely to remain close to A for a long time" rather than "leaves A slowly," but we prefer the simpler wording. For example, for m = 2 a deterministic system which immediately transitions from (0, 1) to (1/N, 1 − 1/N), but then immediately returns to (0, 1), is defined to leave {(0, 1)} slowly.
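Both definitions lend themselves to a direct numerical check for any concrete model, before any Lyapunov machinery is brought to bear. The sketch below is our own minimal illustration, not part of the paper's formal apparatus: step, phi, in_A and in_U are user-supplied (the model's one-step transition, the map φ_N, and membership tests for A and U), and the two routines estimate the quantities appearing in Definitions 1 and 2 by simple Monte Carlo.

def estimate_hitting_time(step, phi, in_A, s0, N, n_runs=1000, t_max=10**6):
    """Monte Carlo estimate of W^N(A, s0) = E_s0[tau_A]; time advances 1/N per step."""
    total = 0.0
    for _ in range(n_runs):
        s, k = s0, 0
        while not in_A(phi(s)) and k < t_max:
            s, k = step(s), k + 1
        total += k / N
    return total / n_runs

def estimate_escape_prob(step, phi, in_U, s0, N, T, n_runs=1000):
    """Monte Carlo estimate of P_s0(tau_{Delta \\ U} <= T) from a state s0 with phi(s0) in A."""
    escapes = 0
    for _ in range(n_runs):
        s = s0
        for _ in range(int(T * N)):   # T units of time = T*N revision steps
            s = step(s)
            if not in_U(phi(s)):
                escapes += 1
                break
    return escapes / n_runs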

3 Lyapunov criteria

This section contains several sufficient conditions for reaching quickly and leaving slowly.

3.1 Main results

We first present results relating the two components of fast convergence to the existence of Lyapunov functions satisfying certain properties. Proposition 1 contains a Lyapunov condition for {X^N} to reach a given set quickly. Where applicable, it also provides an explicit upper bound for the expected hitting time. To provide some intuition for the result, suppose that the Markov processes are deterministic and there is a nonnegative function V for which V(X^N(t)) decreases by at least c/N in the next 1/N-length time interval whenever X^N(t) is outside A. Clearly, when such a process starts at x it must reach A within V(x)/c units of time. The proposition extends this simple argument to the case when the Markov process is not deterministic, but V(X^N(t)) still decreases in expectation at rate c.

Proposition 1. Let A ⊆ Δ, c ∈ (0, ∞), and let V : Δ → [0, ∞) be a bounded function. If

E_s[V(X^N(1/N)) − V(X^N(0))] ≤ −c/N for all s ∈ φ_N^{-1}(Δ \ A),  (2)

then W^N(A, s) ≤ V(φ_N(s))/c for every s ∈ Ω_N. In particular, if (2) holds for all N sufficiently large, then {X^N} reaches A quickly.

Proof. Fix N ≥ 1, define Y(t) = X^N(t ∧ τ_A), and let Z(t) be 0 or 1 according as Y(t) ∈ A or Y(t) ∉ A. By (2), for t = 0, 1/N, 2/N, ..., we have

E[V(Y(t + 1/N)) − V(Y(t)) | Y(t) = y, S^N(t) = s_t] ≤ −c/N

for any (y, s_t) with y ∉ A that occurs with positive probability. The LHS of the above inequality is identically zero if y ∈ A. So we can combine these two observations and write

E[V(Y(t + 1/N)) − V(Y(t)) | Y(t), S^N(t)] ≤ −(c/N) Z(t).

The expected change in the value function conditional on the state s at t = 0 can be computed by iterated expectations as

E_s[V(Y(t + 1/N)) − V(Y(t))] = E_s[ E[V(Y(t + 1/N)) − V(Y(t)) | Y(t), S^N(t)] ].

This gives E_s[V(Y(t + 1/N)) − V(Y(t))] ≤ −(c/N) E_s[Z(t)], which is equivalent to

(1/N) E_s[Z(t)] ≤ (1/c)(E_s[V(Y(t))] − E_s[V(Y(t + 1/N))]).

The expected time to reach A is given by (1/N) E_s[∑_{k=0}^∞ Z(k/N)]. The partial sums are bounded above:

(1/N) E_s[∑_{k=0}^{m−1} Z(k/N)] ≤ (1/c) ∑_{k=0}^{m−1} (E_s[V(Y(k/N))] − E_s[V(Y((k+1)/N))])
= (1/c)(E_s[V(Y(0))] − E_s[V(Y(m/N))]) ≤ (1/c) E_s[V(Y(0))] = (1/c) V(φ_N(s)).

Hence, by monotone convergence, W^N(A, s) ≤ V(φ_N(s))/c.

Remark. The proof resembles that of Foster's theorem (e.g. Brémaud (1999)). Like that proof, the one here has the flavor of martingale arguments although it does not appeal to martingale results.
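For a concrete feel for the bound in Proposition 1, consider the simplest example we could construct (ours, not the paper's): a walk on {0, 1/N, ..., 1} that moves down with probability p > 1/2 and up otherwise, with A = {0} and V(x) = x, so that (2) holds with c = 2p − 1 and the proposition promises W^N(A, x) ≤ x/(2p − 1).

import random

def simulated_hitting_time(x0, N, p, n_runs=2000):
    """Mean time (in units of 1/N) for a downward-biased walk on {0,1/N,...,1} to hit 0."""
    total = 0.0
    for _ in range(n_runs):
        k = round(x0 * N)   # position in grid units
        steps = 0
        while k > 0:
            k += -1 if random.random() < p else (1 if k < N else 0)
            steps += 1
        total += steps / N
    return total / n_runs

N, p, x0 = 100, 0.6, 0.5
c = 2 * p - 1   # drift: E[V(X(t+1/N)) - V(X(t))] <= -c/N with V(x) = x
print("Proposition 1 bound:", x0 / c)                       # 2.5
print("simulated mean     :", simulated_hitting_time(x0, N, p))

For this chain the bound is essentially tight: the walk needs about Nx/(2p − 1) steps of size 1/N to cover the distance x.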

Our second proposition provides a criterion for {X^N} to leave a set A slowly. It requires that the Lyapunov function decrease in expectation whenever X^N(t) is slightly outside A. This suffices because we also assume that there is an upper bound on the rate at which the process can move.³ As a result, whenever the process does jump out of A it first reaches a point slightly outside A. The Lyapunov condition then ensures that for large N the system is unlikely to escape the neighborhood before being drawn back into A.

Proposition 2. Suppose there is a constant K < ∞ such that

P_s(‖X^N(1/N) − X^N(0)‖ ≤ K/N) = 1 for all N and all s ∈ Ω_N,

where ‖·‖ is the Euclidean norm on Δ. Let A ⊆ Δ be a set with φ_N^{-1}(A) ≠ ∅ for all N ≥ N_0, for some N_0. Let c ∈ (0, ∞) and let V : Δ → R be a Lipschitz continuous function such that

V(x) < V(y) for all x ∈ Ā, y ∈ Δ \ Ā.  (3)

Suppose there is an open set U_0 that contains Ā and

E_s[V(X^N(1/N)) − V(X^N(0))] ≤ −c/N  (4)

whenever s ∈ φ_N^{-1}(U_0 \ Ā) and N ≥ N_0. Then {X^N} leaves A slowly.

Remark. In the Appendix we prove the stronger result that there is a constant γ such that for large N the exit time τ_{Δ\U} first-order stochastically dominates an exponential random variable with mean e^{γN}. The argument writes the exit time as the sum over all instances that the process reaches Ā of the transit times to either reach Δ \ U or to return to Ā. Each such transit takes at least one unit of time, so it suffices to show that the number of transits is large enough. The key step is to show that the probability of reaching Δ \ U before returning to Ā when starting in Ā is small (and declines exponentially in N). Intuitively, this is true because the bound on the speed with which the process can move implies that any initial step out of Ā reaches a point that is bounded away from Δ \ U, and because the drift is toward Ā and many steps against the drift are needed to reach Δ \ U, it becomes increasingly unlikely that a transit will lead to Δ \ U rather than to Ā when N is large. The technical argument focuses on the random variable Y(t) = e^{δ_0 N V(X^N(t))}. Using the hypotheses that X^N(t) moves in bounded steps and V is Lipschitz continuous, we can approximate the exponential in the definition of Y using a Taylor expansion and show that for an appropriately small value of δ_0, the random variable Y(t) is a positive supermartingale whenever X^N(t) is between Ā and Δ \ U. This allows us to apply general results on positive supermartingales to conclude that the probability of reaching Δ \ U before returning to Ā is at most e^{−γN} for some constant γ.

3 For example, this would hold if at most K players change their strategies in each 1/N-length time interval.
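To make the Taylor-expansion step in this remark concrete, here is our own back-of-the-envelope version of the computation (a sketch, not the Appendix argument itself). Write ΔV = V(X^N(1/N)) − V(X^N(0)), let L be a Lipschitz constant for V, and note that ‖X^N(1/N) − X^N(0)‖ ≤ K/N implies |ΔV| ≤ LK/N. Using e^u ≤ 1 + u + (u²/2)e^{|u|} and the drift condition (4),

E_s[e^{δ_0 N ΔV}] ≤ 1 + δ_0 N E_s[ΔV] + (1/2)(δ_0 N)² (LK/N)² e^{δ_0 LK}
              ≤ 1 − δ_0 c + (1/2) δ_0² L² K² e^{δ_0 LK} ≤ 1

whenever δ_0 is small enough that δ_0 L² K² e^{δ_0 LK} ≤ 2c. Hence Y(t) = e^{δ_0 N V(X^N(t))} is indeed a positive supermartingale while X^N(t) lies between Ā and Δ \ U, uniformly in N.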

The hypotheses of Propositions 1 and 2 are similar. One difference is that Proposition 2 is more demanding in that it has added a bound on the speed with which the process can move. Another is that Proposition 1 requires that the Lyapunov condition hold on a larger set (whenever X^N(t) is outside A versus just when X^N(t) is in some neighborhood of Ā). When the more restrictive version of each hypothesis holds, the process will both reach A quickly and leave A slowly. Hence, we have fast convergence to A.

Corollary 2.1. Suppose the hypotheses of Proposition 2 are satisfied and that condition (4) holds also for all s ∈ φ_N^{-1}(Δ \ A) when N ≥ N_0. Then {X^N} has fast convergence to A.

Proof. Any Lipschitz continuous function on Δ is bounded, and replacing V(x) by V(x) − min_{y∈Δ} V(y) if necessary, one may assume that V is nonnegative. Hence, the hypotheses of Propositions 1 and 2 are satisfied.

3.2 Extensions

Some models will not satisfy the hypotheses of the results above because the expected decrease in the Lyapunov function decreases to 0 as the state approaches the target set A. In the case of Proposition 1, one will often be able to slightly weaken the desired conclusion and argue that for any open neighborhood U of A, the model reaches U quickly. This will follow if there is a positive lower bound on the rate at which the Lyapunov function decreases when x ∈ Δ \ U. In the case of Proposition 2 we can do even better because a drift that vanishes at A is not a problem. To describe this formally, for A ⊆ Δ, let U(A, ε) denote the ε-neighborhood of A in Δ,

U(A, ε) = {x ∈ Δ : inf_{y∈A} ‖x − y‖ < ε}.

Proposition 3. The conclusion of Proposition 2 that {X^N} leaves A slowly remains true if the Lyapunov hypothesis is replaced with: Suppose there is an open set U_0 that contains Ā and for every ε > 0 there are numbers c_ε > 0 and N_ε ≥ N_0 such that

E_s[V(X^N(1/N)) − V(X^N(0))] ≤ −c_ε/N

for all s ∈ φ_N^{-1}(U_0 \ U(Ā, ε)) and N ≥ N_ε.

Proof. Let U be an open set containing Ā. Let v_1 := max_{x∈Ā} V(x) and choose v_2 > v_1 so small that Ã := {x ∈ Δ : V(x) ≤ v_2} ⊆ U ∩ U_0. Then choose ε > 0 so that U(Ā, ε) ⊆ Ã. Now apply Proposition 2 with A, c, N_0 replaced by Ã, c_ε and N_ε. This yields that {X^N} leaves Ã slowly. Thus, as A ⊆ Ã ⊆ U,

lim_{N→∞} max_{s ∈ φ_N^{-1}(A)} P_s(τ_{Δ\U} ≤ T) ≤ lim_{N→∞} max_{s ∈ φ_N^{-1}(Ã)} P_s(τ_{Δ\U} ≤ T) = 0

for all T ∈ [0, ∞). This proves that {X^N} leaves A slowly.

In some cases models have fast convergence even though they do not always drift toward the selected set. For example, in a model where agents are arranged on a circle to play a coordination game and play a best response to the average play of their k closest neighbors unless a mutation occurs, the number of players playing the risk-dominant action is not monotonically increasing: it can decrease in expectation if there is not a sufficiently large cluster of players playing the risk-dominant action. Ellison (1993) nonetheless shows that play reaches a neighborhood of the state where everyone plays the risk-dominant equilibrium quickly. Intuitively, this occurs because evolution can proceed in a two-step manner: Each period there is a nonzero chance that a cluster of players using the risk-dominant action will form, and whenever such a cluster exists, the model drifts toward everyone playing the risk-dominant equilibrium.

Our results can be extended so that they apply to some models of this variety. Specifically, the proposition below shows that {X^N} reaches a set A quickly when three conditions hold: (1) {X^N} reaches a superset B of A quickly; (2) {X^N} does not stay too long in B \ A, that is, {X^N} reaches A ∪ B^c quickly; and (3) the probability that {X^N} reaches A before B^c when starting anywhere in B \ A is bounded away from zero. The proof uses an argument related to Wald's equation for the expectation of a random sum of i.i.d. random variables. A transition from B^c to A consists of a random number of transitions from B^c to B \ A and back to B^c, with a final transition to A. The assumptions ensure that the expected lengths of the individual transitions are bounded and that the number of these transitions has a finite expectation as well.

Proposition 4. Let A ⊆ B ⊆ Δ. Suppose {X^N} reaches B quickly, {X^N} reaches A ∪ B^c quickly, and there exist c > 0 and N_0 such that for all N ≥ N_0,

P_s(X^N(τ_{A∪B^c}) ∈ A) ≥ c for all s ∈ φ_N^{-1}(B \ A).  (5)

Then {X^N} reaches A quickly.

The proof is contained in the Appendix. We will use this result in the proof of Proposition 6 in our applications section.

4 Applications

This section analyzes two sets of applications to non-equilibrium game dynamics. Subsection 4.1 analyzes models where the evolution of the system depends only on the number of agents using each strategy, so that the state space Ω_N can simply be these numbers. Here, we connect fast convergence with whether the popularity of some strategy is increasing in expectation, and derive several results about particular models as corollaries. Subsection 4.2 considers two examples in which the natural state spaces are more complex: one is a model of personal experience learning; the other is a model with local and global interactions.

4.1 Dynamics based on population averages and random samples

Suppose that a population of N players are choosing strategies for a game with m pure strategies at t = 0, 1/N, 2/N, .... Throughout this section we assume that in each time period a single agent is chosen at random to revise his strategy. (With a time renormalization this could describe a model in which revision opportunities arrive in continuous time according to independent Poisson processes.) Let X_i^N(t) denote the fraction of agents that play strategy i at time t. If the current state of the population is x, suppose the revising agent chooses strategy i with probability f_i(x), regardless of the agent's own current action.⁴ For now, we let f_1, ..., f_m : Δ → [0, 1] be arbitrary functions with f_1(x) + ... + f_m(x) = 1 for all x ∈ Δ.

Part (a) of our general result shows that if the probability of choosing some strategy i exceeds its current share whenever that share is below some threshold a, then there is fast convergence to the states where the share of i exceeds a. Part (b) gives a partial converse.

Proposition 5. Consider a dynamic {X^N} with choice rules f_1(x), ..., f_m(x). Let a, c ∈ (0, 1) and i ∈ {1, ..., m}.

(a) If f_i(x) − x_i ≥ c whenever x_i ≤ a, then {X^N} has fast convergence to {x : x_i > a} and W^N({x : x_i > a}, X^N(0)) ≤ 1/c.

(b) If f_i is continuous and f_i(x) < a whenever x_i = a, then {X^N} leaves {x : x_i < a} slowly.

4 The assumption that the f_i do not depend on the current action rules out models in which agents respond to the current play of all others not including themselves, but since the effect of any one agent's strategy on the overall state is of order 1/N, we do not expect it to matter for large N except perhaps in knife-edge cases.

Proof. Here φ_N(x) = x, S^N = X^N, and Ω_N = {x ∈ Δ : Nx ∈ ℕ_0^m}.

(a) Let V(x) = 1 − x_i. Fast convergence to {x : x_i > a} then follows from Corollary 2.1 since for every x ∈ Ω_N with x_i ≤ a,

E_x[V(X^N(1/N)) − V(X^N(0))] = −E_x[X_i^N(1/N) − X_i^N(0)]
= −(1/N)(1 − x_i) f_i(x) + (1/N) x_i [1 − f_i(x)] = −(1/N)[f_i(x) − x_i] ≤ −c/N.

The uniform bound on the convergence time follows from Proposition 1.

(b) Set c = −(1/2) sup{f_i(x) − x_i : x ∈ Δ, x_i = a}. The set of x with x_i = a is compact, so c > 0. The uniform continuity of f_i implies that we can choose an ε > 0 so that f_i(x) − x_i ≤ −c whenever x_i ∈ [a, a + ε]. The assertion follows from Proposition 2 with V(x) = x_i.

Remark. The continuity of f_i is used in part (b) only to show that f_i(x) − x_i is negative and bounded away from 0 when x_i is in an interval around a. Accordingly, part (b) could be strengthened to apply to discontinuous f_i in a variety of ways. For example, it would suffice to assume that there exist a c > 0 and an a′ > a such that f_i(x) − x_i ≤ −c whenever x_i ∈ (a, a′). In view of Proposition 3 it is also sufficient to assume that f_i is upper semicontinuous with f_i(x) − x_i < 0 whenever x_i ∈ (a, a′).
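The single-revision dynamic is also easy to simulate directly, which makes the bound W^N ≤ 1/c in part (a) easy to see numerically. The sketch below is our illustration rather than code from the paper; f is any choice rule satisfying the hypothesis of part (a) (the logit rule of Example 1 below is one instance).

import random

def time_to_reach(f, counts, N, i, a, rng=random):
    """Run the one-revision-per-period dynamic on integer strategy counts
    (length m, summing to N) until the share of strategy i exceeds a;
    return elapsed time in units of 1/N. Under Proposition 5(a) the
    expectation of the return value is at most 1/c."""
    counts = list(counts)
    steps = 0
    while counts[i] <= a * N:
        x = [c / N for c in counts]
        j = rng.choices(range(len(counts)), weights=counts)[0]   # updater's current strategy
        new = rng.choices(range(len(counts)), weights=f(x))[0]   # strategy chosen via f(x)
        counts[j] -= 1
        counts[new] += 1
        steps += 1
    return steps / N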

Now we present several examples that use this result in coordination games under different dynamics. In each case we assume that strategy 1 is risk dominant.

Example 1 generalizes Kreindler and Young's (2013) analysis of logit responses in 2×2 games to games with m strategies. Specifically, write π_j : Δ → R for the expected payoff function of each strategy j given the state x, and suppose that for a constant β ∈ (0, ∞), a revising agent chooses strategy i with probability

f_i(x) = f_i(x, β) = e^{βπ_i(x)} / ∑_{j=1}^m e^{βπ_j(x)}.⁵

Kreindler and Young (2013) show that when β is small enough (which is a form of "enough noise" condition) in a two-strategy game, the expected popularity of the risk-dominant equilibrium is monotonically increasing until a neighborhood of the state where everyone plays it is reached, and that the state reaches this neighborhood quickly. We generalize this result to m-action coordination games and strengthen the characterization from "reaches quickly" to fast convergence.⁶

Example 1. Consider the model above. Suppose that for all x with x_1 ≤ 1/2, π_1(x) ≥ α_1 x_1 and π_i(x) ≤ α_2 x_i for i = 2, ..., m, where α_1 > 2(m − 2 + e^{βα_2})/(eβ) and α_2 ≥ 0. Then the logit dynamic {X^N} has fast convergence to {x : x_1 > 1/2}.

The proof is in the Appendix; it uses the Schur-convexity of ∑_{i=2}^m e^{βα_2 x_i} to show that the hypothesis of Proposition 5 is satisfied. Note that this hypothesis involves the relationship between α_1 and β: A greater degree of risk dominance is usually required if there is less noise, because otherwise play can get stuck at the dominated equilibrium. For example, in a two-action pure coordination game where the payoff to the equilibrium where both players use action 2 is α_2 = 1, there is fast convergence to the set where the majority of players play action 1 when the payoff to coordinating on action 1 is at least 2 for β = 1, at least e for β = 2, and at least 2e²/3 ≈ 5 if β = 3.⁷

5 Such choice rules can arise either from payoff shocks with an extreme-value distribution or from maximizing a weighted sum of expected utility and entropy. Fudenberg and Kreps (1993) and Fudenberg and Levine (1995) respectively introduced these two types of general stochastic choice rules in the study of learning and fictitious play; Blume (1993) introduced the logit case in evolutionary models and noted the role of the noise level there.

6 Our assumptions on payoffs are slightly broader than pure coordination games.

7 The inequality never holds when β is in a neighborhood of 0, because then choice is almost independent of payoffs, and the risk-dominant equilibrium will only be played by a 1/m fraction of agents. In the m = 2 case even random choice will lead to half of the agents playing strategy 1, so it is very easy to get fast convergence; here Kreindler and Young (2013) show that even α_1 > α_2 suffices whenever β ≤ 2.
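A quick numerical check of the payoff threshold in Example 1 as we have reconstructed it, α_1 > 2(m − 2 + e^{βα_2})/(eβ): the snippet below (ours, not the paper's) reproduces the values just quoted for the two-action pure coordination game with α_2 = 1.

import math

def alpha1_threshold(m, alpha2, beta):
    """Example 1's lower bound on alpha_1 (as reconstructed here)."""
    return 2 * (m - 2 + math.exp(beta * alpha2)) / (math.e * beta)

for beta in (1, 2, 3):
    print(beta, alpha1_threshold(2, 1, beta))
# -> 2.0, then e = 2.718..., then 2e^2/3 = 4.926...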

The next example generalizes the previous one by supposing that the stochastic nature of the revision process arises from agents maximizing a perturbed utility function of the form U(p) = ∑_i [p_i π_i(x) − c(p_i)/β], as in the literature on fictitious play, where 1/β is a measure of the amount of noise in the system, and c : (0, 1] → R is continuously differentiable, strictly convex, and satisfies the Inada condition that lim_{p→0} c′(p) = −∞;⁸ the logit choice function corresponds to c(p) = p ln p. Hofbauer and Sandholm (2002) show that all random utility models where agents maximize the sum of their payoff plus a random shock can also be regarded as members of this class, and that the class is strictly more general in that the cost function c(p) = −ln p (and many others) do not correspond to any form of random utility. Still, we can show that in coordination games the system where revising agents act to maximize U(p) satisfies the monotonicity requirement of Proposition 5(a) if there is enough noise, and so the system has fast convergence to the set where more than 1/2 the agents use the risk-dominant strategy:

Example 2. Consider the model above, i.e. suppose action 1 is risk dominant in a coordination game and let f_1(x) = arg max_p [pπ_1(x) + (1−p)π_2(x) − (c(p) + c(1−p))/β] with c satisfying the conditions above. Then there is fast convergence to the set {x : x_1 > 1/2}, provided β is sufficiently small.

The proof is in the Appendix and notes that the example is a special case of Proposition 7 below. It can also be derived using Proposition 5.

The next example considers a different behavior rule for revising agents: Instead of playing a stochastic best response to the current distribution of strategies, players use an exact best response to a noisy signal of the state. Specifically, at each time t = 1/N, 2/N, ..., one randomly chosen agent i draws, with replacement, a sample of size s from the current population. Assume that rather than correctly recognizing how each sampled agent j is playing, agent i with probability 1 − ε correctly recognizes how j is playing, but with probability ε instead thinks j is playing another action, with all strategies other than the true one being equally likely, and assume that observation errors are independent across all observations. The revising agent i then adopts the best response to this set of observations. Here, we consider more general m × m games, but assume that strategy 1 is strongly risk dominant in the sense that agent i will choose strategy 1 if he observes at least one player playing strategy 1. Note that if ε = 1 − 1/m, then each observation is equally likely to be each pure strategy irrespective of actual play.

In this model we find that there is fast convergence to the set of states where the popularity of strategy 1 exceeds a threshold. In contrast to the earlier examples of this section, we get fast convergence here even when the noise level is arbitrarily low.

Example 3. In the model described above, if strategy 1 is strongly risk dominant and the number of agents sampled is at least two, then {X^N} has fast convergence to {x : x_1 > 1 − (1 − 1/m)^s} for every ε ∈ (0, 1 − 1/m).

We defer the calculations to the Appendix. The key once again is that the probability of choosing the risk-dominant strategy 1 is above its current share as long as the share is below the threshold. This follows from the fact that the above probability is bounded below by a concave function of that share, with the bound being above the share at x_1 = 0 and equal to it at x_1 = 1 − (1 − 1/m)^s.

8 Fudenberg and Levine (1995) show how this leads to stochastic fictitious play, and generates Hannan-consistent choice, meaning that its long-run average payoff is at least as good as the best response to the time average of the moves of Nature and/or other players. Hofbauer and Sandholm (2002), Benaïm, Hofbauer, and Hopkins (2009), and Fudenberg and Takahashi (2011) use perturbed utility to construct Lyapunov functions for stochastic fictitious play, and study perturbed utility in an evolutionary model. Fudenberg, Iijima, and Strzalecki (2014) show that this form of stochastic choice corresponds to the behavior of an agent who is uncertain about his payoff function and has a form of variational preferences, and so randomizes to guard against moves by a malevolent Nature.
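To see the mechanics, note that under the observation-error process each sampled observation independently shows strategy 1 with probability q(x_1) = x_1(1 − ε) + (1 − x_1)ε/(m − 1), so a revising agent adopts the strongly risk-dominant strategy with probability 1 − (1 − q(x_1))^s. The check below (our illustration, not the Appendix calculation) confirms numerically that this probability exceeds x_1 everywhere below the threshold of Example 3 for one parameter choice.

def adopt_prob(x1, s, eps, m):
    """P(at least one of s noisy observations shows strategy 1)."""
    q = x1 * (1 - eps) + (1 - x1) * eps / (m - 1)
    return 1 - (1 - q) ** s

m, s, eps = 2, 3, 0.1
threshold = 1 - (1 - 1 / m) ** s          # 0.875 for these parameters
assert all(adopt_prob(k / 1000, s, eps, m) > k / 1000
           for k in range(int(threshold * 1000)))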

As a final application of Proposition 5 we suppose that agents' beliefs derive from correctly observing a random sample of play. Roughly, our results say that such noisy beliefs speed evolution in the sense that sampling enlarges the set of parameters for which evolution is fast. More precisely, we return to coordination games and we compare the evolution of the Markov chain X^N describing the dynamics where the revising agent chooses strategy 1 with probability g(x) if the current state of the population is x, with the evolution of the chains X_s^N in which updating agents apply the same rule g to a random sample of size s.⁹ In the random sampling model the probability f^(s)(x) that a revising agent chooses strategy 1 is

f^(s)(x) = ∑_{k=0}^s g(k/s) C(s, k) x^k (1 − x)^{s−k}, x ∈ [0, 1].

We begin by considering when the implication

{X^N} reaches (1/2, 1] quickly ⟹ {X_s^N} reaches (1/2, 1] quickly  (6)

is true.¹⁰ Proposition 6 provides a symmetry condition on g under which implication (6) holds for every s. This generalizes Kreindler and Young's (2013) result beyond logit best responses and strengthens the results to show there is fast convergence in the model with sampling beliefs whenever there is fast convergence with full information (as opposed to when a sufficient condition for reaching quickly holds). To understand the condition we place on g and why it is a generalization, note that when sup_{0<x<1/2} g(x) < 1/2, {X^N} cannot reach (1/2, 1] quickly. So, ignoring a knife-edge case, we assume there exists x* ∈ (0, 1/2) with g(x*) ≥ 1/2. In two-action games with strategy 1 being risk dominant, the logit model satisfies a stronger version of the symmetry condition: g(x* + x) + g(x* − x) = 1 for all x ∈ [0, x*], where x* is the mixed-strategy equilibrium. We loosen this to only require an inequality.¹¹

Proposition 6. In the above model of learning in a game let x* ∈ (0, 1/2). Suppose that g(x* + x) + g(x* − x) ≥ 1 for all x ∈ [0, x*] and that g is strictly increasing. Suppose {X^N} reaches [x*, 1] quickly. Then, there exists ξ > 1/2 such that {X^N} and {X_s^N} have fast convergence to (ξ, 1] for every s.

9 Here, the state is the fraction of agents using strategy 1.

10 In the analysis of the relation between {X_s^N} and {X^N} we take advantage of the fact that the functions f^(s) are the Bernstein polynomials of g to exploit known results about properties of these polynomials.

11 We also have strengthened (6) in that we relax the assumption that {X^N} reaches (1/2, 1] quickly to the assumption that {X^N} reaches [x*, 1] quickly.
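Footnote 10's observation also gives a compact way to compute f^(s) in practice: it is exactly the degree-s Bernstein polynomial of g, so sampling acts as a smoothing operator on the full-information rule. A minimal sketch (ours; g is any user-supplied rule):

from math import comb

def sampled_rule(g, s):
    """Choice rule under random samples of size s: the Bernstein polynomial of g."""
    def f(x):
        return sum(g(k / s) * comb(s, k) * x**k * (1 - x)**(s - k)
                   for k in range(s + 1))
    return f

# Sampling smooths g: a near-deterministic rule becomes strictly stochastic.
g = lambda x: 0.05 if x < 0.4 else 0.95
f5 = sampled_rule(g, 5)
print(f5(0.3), f5(0.5))   # both strictly between 0.05 and 0.95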

Our next result sharpens the message of the previous one by showing that there is a range of parameter values for which there is fast convergence with random sampling beliefs, but not when players observe the full state. Consider as above a family of decision rules g(x, β), where 1/β is a measure of the level of noise.

Proposition 7. Consider a 2-action game with affine payoff functions π_i and let x* ∈ (0, 1/2) be such that sign(π_1(x) − π_2(x)) = sign(x − x*). Suppose that the choice rule g can be written in the form g(x, β) = P[β(π_1(x) − π_2(x)) ≥ ε], where the support of ε is R, ε and −ε have the same distribution, and P[ε = 0] < 1 − 2x*. Then there exist 0 < β* < β*_s such that

(i) if β ∈ (0, β*) then both models have fast convergence to the set (1/2, 1].

(ii) if β ∈ (β*, β*_s) then the system with random sampling has fast convergence to (1/2, 1] but the system with full information does not.

Intuitively, fast convergence requires a sufficient amount of noise, and random sampling provides an additional stochastic element without breaking the monotonicity needed to appeal to Proposition 5. Note that the condition of this proposition applies to choice rules generated by the perturbed utility functions considered above.¹²

4.2 More complex state spaces

In the above examples, the state space of the Markov process is naturally taken to be Δ, as the dynamics only depend on the population shares using each strategy. Many other interesting models require a more complex state space. For example, the probability that a player adopts some strategy may depend on the player's position in a network, who he has been matched with previously, or some other factor. An attractive feature of our Lyapunov approach is that it also applies to a variety of such models. In this subsection we discuss two applications: one involving learning from personal experience with recency weights, and one in which agents have both local and uniform interactions.

12 Note also that the assumption that the support of ε is R implies that g(x, β) is always strictly between 0 and 1, i.e. choice is always stochastic. This is needed for the result: when c(p) = p² as in Rosenthal (1989) and Voorneveld (2006) the choice rule is deterministic at some states, and as we show in the Appendix the conclusion of the proposition is false. In this case, there is a β̄ > 0 such that both models have fast convergence to (1/2, 1] if 0 < β < β̄ and the state 0 is an absorbing state for both models if β ≥ β̄.

First, consider a finite-memory fictitious-play style learning model in which N agents are matched to play a two-action two-player game G at t = 0, 1/N, 2/N, ... and learn only from play in their own interactions.¹³ Assume that strategy 1 is p-dominant for some p < 1/2 and that agents remember the outcomes of their k most recent matches.¹⁴ Within each period the players are randomly matched and play the game using strategies they selected at the end of the previous period. One player is then randomly selected to update his or her strategy. The updating player does one of two things. With probability 1 − ε he selects a strategy which is a best response to a weighted average w_1 a_{i,t} + w_2 a_{i,t−1} + ... + w_k a_{i,t−(k−1)} of the actions a_{i,t}, ..., a_{i,t−(k−1)} that his opponents have used in the most recent k periods. With probability ε he selects a strategy uniformly at random. The selected strategy will be used until the player is next chosen to update.

Informally, a motivation for using recency weights is that agents would want to place more weight on more recent observations if they believed that their opponents' play was changing over time. There is ample experimental evidence suggesting that beliefs are indeed more heavily influenced by recent observations, both in decision problems and in games; see Cheung and Friedman (1997) for one of the first measures of recency bias in a game theory experiment, and Erev and Haruvy (2013) for a survey of evidence for recency effects in experimental decision problems. Benaïm, Hofbauer and Hopkins (2009) and Fudenberg and Levine (2014) provide theoretical analyses of recency, but neither considers the large-population limit that is our focus here.

We show that this model has fast convergence to a strongly risk dominant equilibrium if the weights place enough emphasis on recent actions. Note that this result does not require that noise levels be above some threshold, and that play converges to an arbitrarily small neighborhood of the selected equilibrium when the level of noise is sufficiently low.

To state the result formally, note that the model defines a Markov process S^N(t) on the state space Ω_N = {1, 2, ..., m}^{(k+1)N}: the first kN components of the state vector record what each player saw in the periods t − 1, t − 2, ..., t − k; and the last N components record the action that each player has selected for use in period t. Given any state S^N(t) we can define a random variable X^N(t) to be the fraction of players who have each pure strategy as their selected action in state S^N(t).

Example 4. Consider the model above with k > 1. Suppose that strategy 1 is p-dominant in G and the recency weights satisfy w_1 > p and w_2 + w_3 > p. Then, for any ε > 0 the model described above has fast convergence to {x : x_1 > 1 − 1.2ε}.

13 Ellison (1997) studied the finite-memory variant as well as the more traditional infinite-memory fictitious play model with the addition of a single rational player, and showed that the rational player could manipulate the play of a large population of opponents when one action was strongly risk-dominant. Most other studies of fictitious play assume that all agents in the same player role have the same beliefs. One exception is Fudenberg and Takahashi (2011), who allow each of N agents to have different beliefs; they focus on the asymptotics for a fixed N rather than the large-population limit.

14 That is, strategy 1 is the unique best response to every mixed strategy that assigns at least probability p to strategy 1 (Morris, Rob, Shin (1995)).

Remarks.

1. The result implies there can be fast convergence even with long memories and moderate levels of p-dominance provided that players place substantial weights on their most recent experiences.

2. For example, if players place weight proportional to (2/3)^n on their nth most recent observation, then w_1 > (1/2)(2/3) = 1/3 and w_2 + w_3 > 2/9 + 4/27 > 1/3, so there is fast convergence to a neighborhood of strategy 1 if strategy 1 is 1/3-dominant.

3. Another form of recency weighting is to completely ignore all observations from more than k periods ago but weight all of the last k periods equally, so that w_n = 1/k for n = 1, 2, ..., k. In this case the result implies there is fast convergence to a neighborhood of the state where everyone plays strategy 1 if strategy 1 is 1/k-dominant. This implies that we always have convergence to a risk-dominant equilibrium if memories are short enough (but longer than one period).

Proof of Example 4. Consider the two-period-ahead dynamics of the model. By Propositions 1 and 2 it suffices to show that we can find c > 0 for which

inf_{s ∈ φ_N^{-1}(x)} E[(X_1^N(2/N) − X_1^N(0)) | X^N(0) = x, S^N(0) = s] ≥ 2c/N

for all x with x_1 ∈ [0, 1 − 1.2ε]. We can evaluate the change in the popularity of strategy 1 by counting every time a player playing strategy 1 is selected to update as a loss of 1 and every adoption by an updating player as a gain of 1. The expected losses are x_1 from revisions at t = 0 and at most (x_1 + 1/N) from revisions at t = 1/N. There will be a gain without a mutation from the period-t revision if the player selected to update at t is matched with a player who uses strategy 1 in period t, or if he saw strategy 1 in both periods t − 1 and t − 2. At t = 0 only the former is guaranteed to be possible for all s — the worst case state is that the matching was such that the sets of players who saw strategy 1 in periods t − 1 and t − 2 are disjoint — so all we can say is that the expected number of adoptions is at least x_1(1 − ε). But the t = 1/N revision will produce an adoption of the latter type if the player who saw strategy 1 in period t = 0 is randomly matched with a player playing strategy 1 at t = 0 and is then randomly selected to update at t = 1/N. So the expected number of non-mutation adoptions is at least ((x_1 − 1/N) + (1 − x_1)x_1(x_1 − 1/N))(1 − ε). And there are ε/2 adoptions in expectation due to mutations in each of the two periods. Adding all of these changes together and ignoring all of the 1/N terms, it suffices to show there is a c > 0 for which

−2x_1 + (2x_1 + x_1² − x_1³)(1 − ε) + ε > 2c,  (7)

for all x_1 ∈ [0, 1 − lε], where we have written l for 1.2 to clarify the argument and suggest how it could be strengthened. Note that this is equivalent to

x_1² − x_1³ > 2c + (−1 + 2x_1 + x_1² − x_1³)ε.  (8)

It is easy to choose a c so that this will be satisfied for all x_1 ∈ [0, 1/3]. For such x_1 we have −1 + 2x_1 + x_1² − x_1³ = −1 + 2x_1 + x_1²(1 − x_1) ≤ −7/27, so any c < (7/27)(ε/2) suffices.

For x_1 ∈ [1/3, 1 − lε] the LHS of (7) is concave, so it suffices to show that for some fixed c equation (7) holds for both x_1 = 1/3 and x_1 = 1 − lε. We already noted a restriction on c that makes (7) hold for x_1 = 1/3, so it remains only to show that it is satisfied for x_1 = 1 − lε. To do this, we substitute (1 − x_1)/l for ε into (8) and find that it is equivalent to

((1 − x_1)/l)(l x_1² − (−1 + 2x_1 + x_1² − x_1³)) > 2c.

A numerical calculation shows that the polynomial l x_1² − (−1 + 2x_1 + x_1² − x_1³) = 1 − 2x_1 + (l − 1)x_1² + x_1³ is positive for all x_1 ∈ [0, 1] if l > 1.15, in which case one can (given any fixed ε) choose a c that satisfies (8) for all x_1 ∈ [1/3, 1 − lε].

Remark. The result could be strengthened to show that there is fast convergence to a somewhat smaller set by considering k-period-ahead transitions instead of the two-period-ahead transitions considered in the proof.
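The "numerical calculation" cited at the end of the proof is easy to reproduce; the lines below (ours, not the paper's) confirm that 1 − 2x_1 + (l − 1)x_1² + x_1³ is positive on [0, 1] at l = 1.2 and at the quoted cutoff l = 1.15, and show where it fails for smaller l.

def poly_min(l, grid=10**5):
    """Minimum over a fine grid on [0,1] of 1 - 2x + (l-1)x^2 + x^3."""
    return min(1 - 2 * x + (l - 1) * x * x + x ** 3
               for x in (k / grid for k in range(grid + 1)))

print(poly_min(1.2))    # ~ 0.0344 > 0
print(poly_min(1.15))   # ~ 0.0055 > 0
print(poly_min(1.05))   # < 0: the argument breaks down for small l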

Second, consider the following local interaction model with a small-world element: N players are arranged around a circle and are randomly matched to play a coordination game. One player chosen at random considers updating his strategy in each period. With probability 1 − ε > 1/2 the updating player plays a best response to the average of the play in the previous period of four players: his two immediate neighbors and two players selected uniformly at random;¹⁵ with probability ε the updating player chooses the opposite strategy. We focus on the case where strategy 1 is risk-dominant, but not 1/4-dominant: an updating player will choose strategy 1 if at least two of the four players in his sample are using it, but will use strategy 2 if at least 3 players in the sample are using strategy 2.

An application of our Lyapunov condition shows that the model will have fast convergence to the risk-dominant equilibrium if the noise level is above a critical threshold, which turns out to be not very large, even though clusters of players using strategy 1 in a population of mostly strategy-2 players tend to shrink slowly from the edges. Intuitively, when x_1 is very small the drift is to the right because of the ε noise. When x_1 is larger there are additional adoptions from players who have neighbors playing strategy 1 and/or see it in their random sample. This force dominates until almost all players are playing the risk-dominant action.

Example 5. Consider the model above. Suppose that strategy 1 is risk dominant. Then, for ε above a critical threshold, the model has fast convergence to {x : x_1 > 1 − 1.5ε}.

The basic structure of the argument is similar to that for Example 4. The proof is in the Appendix.

Another application that we do not develop here is the use of Proposition 4 to derive results about the selection of iterated p-dominant equilibria. We believe that this would lead to results like those in Oyama, Sandholm, and Tercieux (2014). In addition, we conjecture that our results could be fruitfully applied to the network models of Kreindler and Young (2014), where the dimension of the state space is the size of the population.

5 Related Literature

Ellison (1993) raises the issue of slow convergence; it shows that in the model of Kandori, Mailath, and Rob (1993) the rate of convergence slows as the noise level converges to zero, and that the expected waiting time to reach the long-run equilibrium grows exponentially as the population size increases. Blume (1993) provides conditions under which an infinite-population local interaction model is ergodic; evolution can be thought of as fast for large N if it still occurs when N = ∞. Möbius (2000) defines a model as clustering on a set of states A if the probability of A under the ergodic distribution converges to 1 as N → ∞, and analyzes the limit of the worst-case expected wait to reach A. Kreindler and Young (2013) defines a concept of "fast selection" which is roughly equivalent to what we call "reaches A quickly." Relative to this, the novelty in our concept consists of adding the "leaves A slowly" requirement.

There is also a sizable literature on factors that promote fast evolution, including local interaction (Ellison (1993) and Blume (1993)), endogenous location choice (Ely (2002)), homophily (Golub and Jackson (2012)), and population structure (Hauert, Chen and Imhof (2014)). Samuelson (1994) and Ellison (2000) note that evolution is more rapid when it can proceed via a series of smaller steps. Kreindler and Young (2014) provides sufficient conditions on payoffs and noise for evolution to be fast on arbitrary networks.

The methodological parts of our paper are related to the literature on stochastic stability with constant step size, where the dimension of the Markov process is held fixed as the population grows.¹⁶

15 To simplify the algebra we assume that the random samples are independent draws from the full population (including the player himself and the immediate neighbors).

Benaïm and Weibull (2003) considers models where all agents have the same beliefs (which is why the state space doesn't depend on the population size) and a random player from an N-agent population is chosen to revise his play at t = 1, 2, 3, .... The paper shows that the maximum deviation between the finite model and its continuous limit over a T-period horizon goes to zero as N → ∞, and also that the probability of exiting the basin of attraction of an attractor of the continuous model by time T goes to zero (exponentially fast) as the population size increases; this is related to our "leaving slowly" results. Roth and Sandholm (2013) develops more general results along these lines, and provides conditions that imply that play in the discrete-time model remains within ε of the dynamics of the continuous-time model for at least T periods with probability that goes to one as N grows. These results can provide an alternate method for establishing some of our applied results in cases where the state space remains the same as the population grows.

6 Conclusion

We defined a notion of fast convergence for evolutionary models, which refines the previous literature by requiring that some set A is both reached quickly and left slowly. We then used Lyapunov functions to give sufficient conditions for fast convergence. One advantage of our approach is that it can apply to models with state spaces that do not have a finite-dimensional continuous-time limit dynamic. At a conceptual level, our proofs separate the factors sufficient for fast convergence from the technical conditions needed to assure a well-behaved approach to the mean field. We illustrated the use of our conditions in various examples without presenting quantitative results, but the simulations of e.g. Ellison (1993) and Kreindler and Young (2013) make us optimistic that when convergence is fast in our sense it will be fast enough to be of practical importance.

16 Kaniovski and Young (1995) and Benaïm and Hirsch (1999) provide results connecting discrete-time and continuous-time limit dynamics in the context of fictitious-play style models, but the results themselves are not closely related because the approximation occurs in the t → ∞ limit with the population size held fixed.

References

[1] Michel Benaïm, Morris W. Hirsch, 1999. Mixed Equilibria and Dynamical Systems Arising from Fictitious Play in Perturbed Games. Games and Economic Behavior 29.

[2] Michel Benaïm, Josef Hofbauer, Ed Hopkins, 2009. Learning in Games with Unstable Equilibria. Journal of Economic Theory 144.

[3] Michel Benaïm, Jörgen W. Weibull, 2003. Deterministic Approximation of Stochastic Evolution in Games. Econometrica 71.

[4] Lawrence E. Blume, 1993. The Statistical Mechanics of Strategic Interaction. Games and Economic Behavior 5.

[5] Pierre Brémaud, 1999. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer-Verlag, New York.

[6] Yin-Wong Cheung, Daniel Friedman, 1997. Individual Learning in Normal Form Games: Some Laboratory Results. Games and Economic Behavior 19.

[7] Richard Durrett, 1996. Probability: Theory and Examples, second ed. Duxbury Press, Belmont.

[8] Glenn Ellison, 1993. Learning, Local Interaction, and Coordination. Econometrica 61.

[9] Glenn Ellison, 1997. Learning from Personal Experience: One Rational Guy and the Justification of Myopia. Games and Economic Behavior 19.

[10] Glenn Ellison, 2000. Basins of Attraction, Long Run Equilibria, and the Speed of Step-by-Step Evolution. Review of Economic Studies 67.

[11] Jeffrey Ely, 2002. Local Conventions. Advances in Theoretical Economics 2.

[12] Ido Erev, Ernan Haruvy, 2013. Learning and the Economics of Small Decisions. Handbook of Experimental Economics, Vol. 2, Alvin E. Roth and John Kagel (eds.), forthcoming.

[13] Drew Fudenberg, Ryota Iijima, Tomasz Strzalecki, 2014. Stochastic Choice and Revealed Perturbed Utility, mimeo.

[14] Drew Fudenberg, David K. Levine, 1995. Consistency and Cautious Fictitious Play. Journal of Economic Dynamics and Control 19.

[15] Drew Fudenberg, David K. Levine, 2014. Recency, Consistent Learning, and Nash Equilibrium. Proceedings of the National Academy of Sciences 111.

[16] Drew Fudenberg, Satoru Takahashi, 2011. Heterogeneous Beliefs and Local Information in Stochastic Fictitious Play. Games and Economic Behavior 71.

[17] Benjamin Golub, Matthew O. Jackson, 2012. How Homophily Affects the Speed of Learning and Best-Response Dynamics. The Quarterly Journal of Economics 127.

[18] Christoph Hauert, Yu-Ting Chen, Lorens A. Imhof, 2014. Fixation Times in Deme Structured, Finite Populations with Rare Migration. Journal of Statistical Physics 156.

[19] Josef Hofbauer, William H. Sandholm, 2002. On the Global Convergence of Stochastic Fictitious Play. Econometrica 70.

[20] Michihiro Kandori, George J. Mailath, Rafael Rob, 1993. Learning, Mutation, and Long Run Equilibria in Games. Econometrica 61.

[21] Yuri M. Kaniovski, H. Peyton Young, 1995. Learning Dynamics in Games with Stochastic Perturbations. Games and Economic Behavior 11.

[22] Samuel Karlin, Howard M. Taylor, 1975. A First Course in Stochastic Processes, second ed. Academic Press, San Diego.

[23] Gabriel E. Kreindler, H. Peyton Young, 2013. Fast Convergence in Evolutionary Equilibrium Selection. Games and Economic Behavior 80.

[24] Gabriel E. Kreindler, H. Peyton Young, 2014. Rapid Innovation Diffusion in Social Networks. Proceedings of the National Academy of Sciences 111.

[25] George G. Lorentz, 1986. Bernstein Polynomials, second ed. Chelsea, New York.

[26] Albert W. Marshall, Ingram Olkin, 1979. Inequalities: Theory of Majorization and Its Applications. Academic Press, San Diego.

[27] Stephen Morris, Rafael Rob, Hyun Song Shin, 1995. p-Dominance and Belief Potential. Econometrica 63.

[28] Markus Möbius, 2000. The Formation of Ghettos as a Local Interaction Phenomenon. In Local Networks: Theory and Applications, Ph.D. Thesis, MIT.

[29] Daisuke Oyama, William H. Sandholm, Olivier Tercieux, 2014. Sampling Best Response Dynamics and Deterministic Equilibrium Selection. Theoretical Economics, forthcoming.

[30] Robert Rosenthal, 1989. A Bounded-Rationality Approach to the Study of Noncooperative Games. International Journal of Game Theory 18.

[31] Gregory Roth, William Sandholm, 2013. Stochastic Approximations with Constant Step Size and Differential Inclusions. SIAM Journal on Control and Optimization 51.

[32] Larry Samuelson, 1994. Stochastically Stable Sets in Games with Alternative Best Replies. Journal of Economic Theory 64.

[33] Albert N. Shiryaev, 1996. Probability, second ed. Springer-Verlag, New York.

[34] Mark Voorneveld, 2006. Probabilistic Choice in Games: Properties of Rosenthal's t-Solutions. International Journal of Game Theory 34.


More information

Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication)

Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication) Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication) Nikolaus Robalino and Arthur Robson Appendix B: Proof of Theorem 2 This appendix contains the proof of Theorem

More information

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, and Benjamin Van Roy Stanford University {gweintra,lanierb,bvr}@stanford.edu Abstract

More information

CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash

CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash Equilibrium Price of Stability Coping With NP-Hardness

More information

epub WU Institutional Repository

epub WU Institutional Repository epub WU Institutional Repository Ulrich Berger Learning in games with strategic complementarities revisited Article (Submitted) Original Citation: Berger, Ulrich (2008) Learning in games with strategic

More information

Brown s Original Fictitious Play

Brown s Original Fictitious Play manuscript No. Brown s Original Fictitious Play Ulrich Berger Vienna University of Economics, Department VW5 Augasse 2-6, A-1090 Vienna, Austria e-mail: ulrich.berger@wu-wien.ac.at March 2005 Abstract

More information

1 Lattices and Tarski s Theorem

1 Lattices and Tarski s Theorem MS&E 336 Lecture 8: Supermodular games Ramesh Johari April 30, 2007 In this lecture, we develop the theory of supermodular games; key references are the papers of Topkis [7], Vives [8], and Milgrom and

More information

First Prev Next Last Go Back Full Screen Close Quit. Game Theory. Giorgio Fagiolo

First Prev Next Last Go Back Full Screen Close Quit. Game Theory. Giorgio Fagiolo Game Theory Giorgio Fagiolo giorgio.fagiolo@univr.it https://mail.sssup.it/ fagiolo/welcome.html Academic Year 2005-2006 University of Verona Summary 1. Why Game Theory? 2. Cooperative vs. Noncooperative

More information

arxiv: v1 [math.pr] 26 Mar 2008

arxiv: v1 [math.pr] 26 Mar 2008 arxiv:0803.3679v1 [math.pr] 26 Mar 2008 The game-theoretic martingales behind the zero-one laws Akimichi Takemura 1 takemura@stat.t.u-tokyo.ac.jp, http://www.e.u-tokyo.ac.jp/ takemura Vladimir Vovk 2 vovk@cs.rhul.ac.uk,

More information

Monotone Imitation Dynamics in Large Populations

Monotone Imitation Dynamics in Large Populations Monotone Imitation Dynamics in Large Populations The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Citation Published Version

More information

Point Process Control

Point Process Control Point Process Control The following note is based on Chapters I, II and VII in Brémaud s book Point Processes and Queues (1981). 1 Basic Definitions Consider some probability space (Ω, F, P). A real-valued

More information

Ergodicity and Non-Ergodicity in Economics

Ergodicity and Non-Ergodicity in Economics Abstract An stochastic system is called ergodic if it tends in probability to a limiting form that is independent of the initial conditions. Breakdown of ergodicity gives rise to path dependence. We illustrate

More information

Evolution Through Imitation in a. Single Population 1

Evolution Through Imitation in a. Single Population 1 Evolution Through Imitation in a Single Population 1 David K. Levine and Wolfgang Pesendorfer 2 First version: September 29, 1999 This version: May 10, 2000 Abstract: Kandori, Mailath and Rob [1993] and

More information

Deterministic Calibration and Nash Equilibrium

Deterministic Calibration and Nash Equilibrium University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 2-2008 Deterministic Calibration and Nash Equilibrium Sham M. Kakade Dean P. Foster University of Pennsylvania Follow

More information

Local Interactions under Switching Costs

Local Interactions under Switching Costs Local Interactions under Switching Costs Ge Jiang Simon Weidenholzer September 016 Abstract We study the impact of switching costs on the long run outcome in coordination games played in the circular city

More information

Strongly Consistent Self-Confirming Equilibrium

Strongly Consistent Self-Confirming Equilibrium Strongly Consistent Self-Confirming Equilibrium YUICHIRO KAMADA 1 Department of Economics, Harvard University, Cambridge, MA 02138 Abstract Fudenberg and Levine (1993a) introduce the notion of self-confirming

More information

Sampling best response dynamics and deterministic equilibrium selection

Sampling best response dynamics and deterministic equilibrium selection Theoretical Economics 10 (2015), 243 281 1555-7561/20150243 Sampling best response dynamics and deterministic equilibrium selection Daisuke Oyama Faculty of Economics, University of Tokyo William H. Sandholm

More information

Citation Hitotsubashi Journal of Economics,

Citation Hitotsubashi Journal of Economics, The Long Run Equilibrium in Title a the Sexes" Game Author(s) Hahn, Sunku Citation Hitotsubashi Journal of Economics, Issue 2003-06 Date Type Departmental Bulletin Paper Text Version publisher URL http://doi.org/10.15057/7682

More information

LEARNING IN CONCAVE GAMES

LEARNING IN CONCAVE GAMES LEARNING IN CONCAVE GAMES P. Mertikopoulos French National Center for Scientific Research (CNRS) Laboratoire d Informatique de Grenoble GSBE ETBC seminar Maastricht, October 22, 2015 Motivation and Preliminaries

More information

Survival of Dominated Strategies under Evolutionary Dynamics. Josef Hofbauer University of Vienna. William H. Sandholm University of Wisconsin

Survival of Dominated Strategies under Evolutionary Dynamics. Josef Hofbauer University of Vienna. William H. Sandholm University of Wisconsin Survival of Dominated Strategies under Evolutionary Dynamics Josef Hofbauer University of Vienna William H. Sandholm University of Wisconsin a hypnodisk 1 Survival of Dominated Strategies under Evolutionary

More information

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about 0368.4170: Cryptography and Game Theory Ran Canetti and Alon Rosen Lecture 7 02 December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about Two-Player zero-sum games (min-max theorem) Mixed

More information

Sequential Equilibria of Multi-Stage Games with Infinite Sets of Types and Actions

Sequential Equilibria of Multi-Stage Games with Infinite Sets of Types and Actions Sequential Equilibria of Multi-Stage Games with Infinite Sets of Types and Actions by Roger B. Myerson and Philip J. Reny* Draft notes October 2011 http://home.uchicago.edu/~preny/papers/bigseqm.pdf Abstract:

More information

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E.

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Tilburg University On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Publication date: 1997 Link to publication General rights Copyright and

More information

Noisy Evolution in Normal Form Games

Noisy Evolution in Normal Form Games Noisy Evolution in Normal Form Games Christoph Kuzmics J. L. Kellogg School of Management Northwestern University 2001 Sheridan Road Evanston, IL 60208-2009 email: C.A.Kuzmics.99@cantab.net phone: 847

More information

Payoff Continuity in Incomplete Information Games

Payoff Continuity in Incomplete Information Games journal of economic theory 82, 267276 (1998) article no. ET982418 Payoff Continuity in Incomplete Information Games Atsushi Kajii* Institute of Policy and Planning Sciences, University of Tsukuba, 1-1-1

More information

Stochastic Evolutionary Game Dynamics: Foundations, Deterministic Approximation, and Equilibrium Selection

Stochastic Evolutionary Game Dynamics: Foundations, Deterministic Approximation, and Equilibrium Selection Stochastic Evolutionary Game Dynamics: Foundations, Deterministic Approximation, and Equilibrium Selection William H. Sandholm October 31, 2010 Abstract We present a general model of stochastic evolution

More information

Coordination and Continuous Choice

Coordination and Continuous Choice Coordination and Continuous Choice Stephen Morris and Ming Yang Princeton University and Duke University December 2016 Abstract We study a coordination game where players choose what information to acquire

More information

Population Games and Evolutionary Dynamics

Population Games and Evolutionary Dynamics Population Games and Evolutionary Dynamics (MIT Press, 200x; draft posted on my website) 1. Population games 2. Revision protocols and evolutionary dynamics 3. Potential games and their applications 4.

More information

Large Deviations and Stochastic Stability in the Small Noise Double Limit, I: Theory

Large Deviations and Stochastic Stability in the Small Noise Double Limit, I: Theory March 204 505 Center for Mathematical Economics Working Papers Large Deviations and Stochastic Stability in the Small oise Double Limit, I: Theory Willian H. Sandholm and Mathias Staudigl Center for Mathematical

More information

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games 6.254 : Game Theory with Engineering Applications Lecture 7: Asu Ozdaglar MIT February 25, 2010 1 Introduction Outline Uniqueness of a Pure Nash Equilibrium for Continuous Games Reading: Rosen J.B., Existence

More information

CHARACTERISING EQUILIBRIUM SELECTION IN GLOBAL GAMES WITH STRATEGIC COMPLEMENTARITIES. 1. Introduction

CHARACTERISING EQUILIBRIUM SELECTION IN GLOBAL GAMES WITH STRATEGIC COMPLEMENTARITIES. 1. Introduction CHARACTERISING EQUILIBRIUM SELECTION IN GLOBAL GAMES WITH STRATEGIC COMPLEMENTARITIES CHRISTIAN BASTECK* TIJMEN R. DANIËLS FRANK HEINEMANN TECHNISCHE UNIVERSITÄT BERLIN Abstract. Global games are widely

More information

Rapid Innovation Diffusion in Social Networks

Rapid Innovation Diffusion in Social Networks Rapid Innovation Diffusion in Social Networks Gabriel E. Kreindler a and H. Peyton Young b April 12, 2014 Abstract. Social and technological innovations often spread through social networks as people respond

More information

Constructive Proof of the Fan-Glicksberg Fixed Point Theorem for Sequentially Locally Non-constant Multi-functions in a Locally Convex Space

Constructive Proof of the Fan-Glicksberg Fixed Point Theorem for Sequentially Locally Non-constant Multi-functions in a Locally Convex Space Constructive Proof of the Fan-Glicksberg Fixed Point Theorem for Sequentially Locally Non-constant Multi-functions in a Locally Convex Space Yasuhito Tanaka, Member, IAENG, Abstract In this paper we constructively

More information

Patience and Ultimatum in Bargaining

Patience and Ultimatum in Bargaining Patience and Ultimatum in Bargaining Björn Segendorff Department of Economics Stockholm School of Economics PO Box 6501 SE-113 83STOCKHOLM SWEDEN SSE/EFI Working Paper Series in Economics and Finance No

More information

Optimism in the Face of Uncertainty Should be Refutable

Optimism in the Face of Uncertainty Should be Refutable Optimism in the Face of Uncertainty Should be Refutable Ronald ORTNER Montanuniversität Leoben Department Mathematik und Informationstechnolgie Franz-Josef-Strasse 18, 8700 Leoben, Austria, Phone number:

More information

Large Deviations and Stochastic Stability in the Small Noise Double Limit

Large Deviations and Stochastic Stability in the Small Noise Double Limit Large Deviations and Stochastic Stability in the Small Noise Double Limit William H. Sandholm and Mathias Staudigl February 8, 2015 Abstract We consider a model of stochastic evolution under general noisy

More information

Robustness of Equilibria in Anonymous Local Games

Robustness of Equilibria in Anonymous Local Games Robustness of Equilibria in Anonymous Local Games Willemien Kets October 12, 2010 Abstract This paper studies the robustness of symmetric equilibria in anonymous local games to perturbations of prior beliefs.

More information

Notes on Large Deviations in Economics and Finance. Noah Williams

Notes on Large Deviations in Economics and Finance. Noah Williams Notes on Large Deviations in Economics and Finance Noah Williams Princeton University and NBER http://www.princeton.edu/ noahw Notes on Large Deviations 1 Introduction What is large deviation theory? Loosely:

More information

The No-Regret Framework for Online Learning

The No-Regret Framework for Online Learning The No-Regret Framework for Online Learning A Tutorial Introduction Nahum Shimkin Technion Israel Institute of Technology Haifa, Israel Stochastic Processes in Engineering IIT Mumbai, March 2013 N. Shimkin,

More information

Retrospective Spectrum Access Protocol: A Completely Uncoupled Learning Algorithm for Cognitive Networks

Retrospective Spectrum Access Protocol: A Completely Uncoupled Learning Algorithm for Cognitive Networks Retrospective Spectrum Access Protocol: A Completely Uncoupled Learning Algorithm for Cognitive Networks Marceau Coupechoux, Stefano Iellamo, Lin Chen + TELECOM ParisTech (INFRES/RMS) and CNRS LTCI + University

More information

Lecture 1: Brief Review on Stochastic Processes

Lecture 1: Brief Review on Stochastic Processes Lecture 1: Brief Review on Stochastic Processes A stochastic process is a collection of random variables {X t (s) : t T, s S}, where T is some index set and S is the common sample space of the random variables.

More information

Monotonic ɛ-equilibria in strongly symmetric games

Monotonic ɛ-equilibria in strongly symmetric games Monotonic ɛ-equilibria in strongly symmetric games Shiran Rachmilevitch April 22, 2016 Abstract ɛ-equilibrium allows for worse actions to be played with higher probability than better actions. I introduce

More information

MDP Preliminaries. Nan Jiang. February 10, 2019

MDP Preliminaries. Nan Jiang. February 10, 2019 MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process

More information

Online Appendix to: Contagion and uninvadability in local interaction games: The bilingual game and general supermodular games

Online Appendix to: Contagion and uninvadability in local interaction games: The bilingual game and general supermodular games Online Appendix to: Contagion and uninvadability in local interaction games: The bilingual game and general supermodular games Daisuke Oyama a, Satoru Takahashi b a Faculty of Economics, University of

More information

Rate of Convergence of Learning in Social Networks

Rate of Convergence of Learning in Social Networks Rate of Convergence of Learning in Social Networks Ilan Lobel, Daron Acemoglu, Munther Dahleh and Asuman Ozdaglar Massachusetts Institute of Technology Cambridge, MA Abstract We study the rate of convergence

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Irrational behavior in the Brown von Neumann Nash dynamics

Irrational behavior in the Brown von Neumann Nash dynamics Games and Economic Behavior 56 (2006) 1 6 www.elsevier.com/locate/geb Irrational behavior in the Brown von Neumann Nash dynamics Ulrich Berger a,, Josef Hofbauer b a Vienna University of Economics and

More information

CS364A: Algorithmic Game Theory Lecture #13: Potential Games; A Hierarchy of Equilibria

CS364A: Algorithmic Game Theory Lecture #13: Potential Games; A Hierarchy of Equilibria CS364A: Algorithmic Game Theory Lecture #13: Potential Games; A Hierarchy of Equilibria Tim Roughgarden November 4, 2013 Last lecture we proved that every pure Nash equilibrium of an atomic selfish routing

More information

BELIEFS & EVOLUTIONARY GAME THEORY

BELIEFS & EVOLUTIONARY GAME THEORY 1 / 32 BELIEFS & EVOLUTIONARY GAME THEORY Heinrich H. Nax hnax@ethz.ch & Bary S. R. Pradelski bpradelski@ethz.ch May 15, 217: Lecture 1 2 / 32 Plan Normal form games Equilibrium invariance Equilibrium

More information

Evolution, Population Growth, and History Dependence *

Evolution, Population Growth, and History Dependence * Evolution, Population Growth, and History Dependence * William H. Sandholm MEDS KGSM Northwestern University Evanston, IL 6008, U.S.A. e-mail: whs@nwu.edu Ady Pauzner Department of Economics Tel-Aviv University

More information

Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules. Supplement

Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules. Supplement Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules Supplement Ernst Maug Kristian Rydqvist September 2008 1 Additional Results on the Theory of Strategic Voting

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

Learning by (limited) forward looking players

Learning by (limited) forward looking players Learning by (limited) forward looking players Friederike Mengel Maastricht University April 2009 Abstract We present a model of adaptive economic agents who are k periods forward looking. Agents in our

More information

A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games

A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games Dimitrios P. Tsomocos Dimitris Voliotis Abstract In our model, we treat a market game where traders

More information

Tijmen Daniëls Universiteit van Amsterdam. Abstract

Tijmen Daniëls Universiteit van Amsterdam. Abstract Pure strategy dominance with quasiconcave utility functions Tijmen Daniëls Universiteit van Amsterdam Abstract By a result of Pearce (1984), in a finite strategic form game, the set of a player's serially

More information

Persuading a Pessimist

Persuading a Pessimist Persuading a Pessimist Afshin Nikzad PRELIMINARY DRAFT Abstract While in practice most persuasion mechanisms have a simple structure, optimal signals in the Bayesian persuasion framework may not be so.

More information

UNIVERSITY OF VIENNA

UNIVERSITY OF VIENNA WORKING PAPERS Akihiko Matsui Daisuke Oyama Rationalizable Foresight Dynamics: Evolution and Rationalizability September 2002 Working Paper No: 0302 DEPARTMENT OF ECONOMICS UNIVERSITY OF VIENNA All our

More information

Random Extensive Form Games and its Application to Bargaining

Random Extensive Form Games and its Application to Bargaining Random Extensive Form Games and its Application to Bargaining arxiv:1509.02337v1 [cs.gt] 8 Sep 2015 Itai Arieli, Yakov Babichenko October 9, 2018 Abstract We consider two-player random extensive form games

More information

(x k ) sequence in F, lim x k = x x F. If F : R n R is a function, level sets and sublevel sets of F are any sets of the form (respectively);

(x k ) sequence in F, lim x k = x x F. If F : R n R is a function, level sets and sublevel sets of F are any sets of the form (respectively); STABILITY OF EQUILIBRIA AND LIAPUNOV FUNCTIONS. By topological properties in general we mean qualitative geometric properties (of subsets of R n or of functions in R n ), that is, those that don t depend

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Stochastic Stability - H.J. Kushner

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Stochastic Stability - H.J. Kushner STOCHASTIC STABILITY H.J. Kushner Applied Mathematics, Brown University, Providence, RI, USA. Keywords: stability, stochastic stability, random perturbations, Markov systems, robustness, perturbed systems,

More information

Area I: Contract Theory Question (Econ 206)

Area I: Contract Theory Question (Econ 206) Theory Field Exam Summer 2011 Instructions You must complete two of the four areas (the areas being (I) contract theory, (II) game theory A, (III) game theory B, and (IV) psychology & economics). Be sure

More information

A Generic Bound on Cycles in Two-Player Games

A Generic Bound on Cycles in Two-Player Games A Generic Bound on Cycles in Two-Player Games David S. Ahn February 006 Abstract We provide a bound on the size of simultaneous best response cycles for generic finite two-player games. The bound shows

More information

Could Nash equilibria exist if the payoff functions are not quasi-concave?

Could Nash equilibria exist if the payoff functions are not quasi-concave? Could Nash equilibria exist if the payoff functions are not quasi-concave? (Very preliminary version) Bich philippe Abstract In a recent but well known paper (see [11]), Reny has proved the existence of

More information

Satisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games

Satisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games Satisfaction Equilibrium: Achieving Cooperation in Incomplete Information Games Stéphane Ross and Brahim Chaib-draa Department of Computer Science and Software Engineering Laval University, Québec (Qc),

More information

Blackwell s Approachability Theorem: A Generalization in a Special Case. Amy Greenwald, Amir Jafari and Casey Marks

Blackwell s Approachability Theorem: A Generalization in a Special Case. Amy Greenwald, Amir Jafari and Casey Marks Blackwell s Approachability Theorem: A Generalization in a Special Case Amy Greenwald, Amir Jafari and Casey Marks Department of Computer Science Brown University Providence, Rhode Island 02912 CS-06-01

More information

TWO COMPETING MODELSOF HOW PEOPLE LEARN IN GAMES. By Ed Hopkins 1

TWO COMPETING MODELSOF HOW PEOPLE LEARN IN GAMES. By Ed Hopkins 1 Econometrica, Vol. 70, No. 6 (November, 2002), 2141 2166 TWO COMPETING MODELSOF HOW PEOPLE LEARN IN GAMES By Ed Hopkins 1 Reinforcement learning and stochastic fictitious play are apparent rivals as models

More information

Supplementary appendix to the paper Hierarchical cheap talk Not for publication

Supplementary appendix to the paper Hierarchical cheap talk Not for publication Supplementary appendix to the paper Hierarchical cheap talk Not for publication Attila Ambrus, Eduardo M. Azevedo, and Yuichiro Kamada December 3, 011 1 Monotonicity of the set of pure-strategy equilibria

More information

Iterated Generalized Half-dominance and Global Game Selection

Iterated Generalized Half-dominance and Global Game Selection Iterated Generalized Half-dominance and Global Game Selection Ryota Iijima May 3, 15 Abstract We offer an equilibrium characterization of a general class of global games with strategic complementarities.

More information

Learning from Others Outcomes

Learning from Others Outcomes Learning from Others Outcomes Alexander Wolitzky MIT BFI, July 21, 2017 Wolitzky (MIT) Learning from Others Outcomes BFI, July 21, 2017 1 / 53 Introduction Scarcity of Cost-Saving Innovation Development

More information

SANDWICH GAMES. July 9, 2014

SANDWICH GAMES. July 9, 2014 SANDWICH GAMES EHUD LEHRER AND ROEE TEPER July 9, 204 Abstract. The extension of set functions (or capacities) in a concave fashion, namely a concavification, is an important issue in decision theory and

More information

The Growth of Functions. A Practical Introduction with as Little Theory as possible

The Growth of Functions. A Practical Introduction with as Little Theory as possible The Growth of Functions A Practical Introduction with as Little Theory as possible Complexity of Algorithms (1) Before we talk about the growth of functions and the concept of order, let s discuss why

More information

Stochastic stability in spatial three-player games

Stochastic stability in spatial three-player games Stochastic stability in spatial three-player games arxiv:cond-mat/0409649v1 [cond-mat.other] 24 Sep 2004 Jacek Miȩkisz Institute of Applied Mathematics and Mechanics Warsaw University ul. Banacha 2 02-097

More information