Fast Convergence in Semi-Anonymous Potential Games


Holly Borowski, Jason R. Marden, and Eric W. Frew

(This research was supported by AFOSR grants #FA and #FA, by ONR grant #N, and by the NASA Aeronautics Scholarship Program. H. Borowski is a graduate research assistant with the Department of Aerospace Engineering, University of Colorado, Boulder, holly.borowski@colorado.edu. J. R. Marden is with the Department of Electrical, Computer, and Energy Engineering, University of Colorado, Boulder, CO 80309, jason.marden@colorado.edu. Corresponding author. E. W. Frew is with the Department of Aerospace Engineering, University of Colorado, Boulder, CO 80309, eric.frew@colorado.edu.)

Abstract: The log-linear learning algorithm has been extensively studied in both the game theoretic and distributed control literature. A central appeal of log-linear learning for distributed control is that it often guarantees that the agents' behavior will converge in probability to the optimal configuration. However, a central issue with log-linear learning for this purpose is that the worst case convergence time can be prohibitively long, e.g., exponential in the number of players. In this work, we formalize a modified log-linear learning algorithm that exhibits a worst case convergence time that is roughly linear in the number of players. We prove this characterization for semi-anonymous potential games with limited populations, that is, potential games in which each agent's utility function can be expressed as a function of aggregate behavior within each population.

I. INTRODUCTION

Game theoretic learning algorithms have gained traction as a powerful design tool for distributed control systems [6], [8], [15], [23], [25]. Here, a static game is repeated over time and agents are given opportunities to revise their strategies in response to their local objective functions and other agents' behavior. The emergent behavior associated with such revision strategies has been studied extensively in the existing literature, e.g., fictitious play [5], [14], [16], regret matching [11], and log-linear learning [1], [3], [22], among many others. While many existing results characterize desirable asymptotic guarantees, the convergence times associated with these algorithms remain uncharacterized or have been shown to be prohibitively long [4], [10], [12], [22]. Characterizing convergence rates is a key component of determining whether a game theoretic algorithm is desirable for system control.

The class of games known as potential games [17] has received significant research attention with regard to distributed control. A game is a potential game if each agent's local objective function is aligned with some global system-level objective function. While potential games may not naturally emerge in many social systems, it is important to highlight that for engineering systems there are methods for designing local agent objective functions such that (i) the resulting game is a potential game and (ii) the optimal behavior corresponds to a (pure) Nash equilibrium of this designed game [2], [15], [24]. Consequently, there has been significant research aimed at deriving distributed learning algorithms that converge to these efficient Nash equilibria in potential games. One such learning algorithm that accomplishes this task is termed log-linear learning [3].
Log-linear learning can be interpreted as a perturbed, or noisy, best reply process in which individual agents typically select the best action given their beliefs regarding the behavior of the other agents. However, agents periodically make mistakes and select suboptimal actions, with a probability that decays exponentially in the associated payoff loss. As the noise level tends to zero, the structure of this perturbation model ensures that the resulting process has a unique stationary distribution with full support on the efficient Nash equilibria. Thus, by designing the agents' local objective functions appropriately, log-linear learning can be employed to derive distributed control laws with highly desirable asymptotic guarantees.

Unfortunately, the convergence rates associated with log-linear learning in its nominal form have been shown to be exponential in the game size [22]. This stems from an inherent tension between desirable asymptotic behavior and resulting convergence rates: small noise levels are necessary for ensuring that the bulk of the mass of the stationary distribution lies on the efficient Nash equilibria; however, small noise levels also make it difficult to leave inefficient Nash equilibria, which ultimately slows convergence. Nonetheless, positive results regarding convergence rates of log-linear learning and its variants have recently begun to emerge for specific problem instantiations [13], [18], [22]. In [22], the authors introduce a log-linear learning variant and show that for a class of potential games, i.e., congestion games with symmetric agents and singleton strategies, the convergence time grows roughly linearly in the number of agents. Similar positive results were derived in [18], which focuses on a different class of potential games: local interaction games with a sufficiently sparse interaction structure. Accordingly, the two dominant questions that emerge are the following: (i) Do such positive results extend to broader classes of games? (ii) Is it possible to design systems to fit into one of these two classes?
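To make the preceding description of the logit update concrete, here is a minimal illustrative sketch (the code and instance are ours, not part of the original analysis): an agent samples among its actions with probabilities proportional to $e^{\beta \cdot \text{payoff}}$, so larger $\beta$ concentrates the choice on payoff-maximizing actions and $\beta \to 0$ gives uniform play.

```python
import math, random

def logit_choice(utilities, beta):
    """Sample an action with probability proportional to exp(beta * payoff).

    utilities: dict mapping each candidate action to its payoff, holding the
    other agents' actions fixed.
    """
    actions = list(utilities)
    u_max = max(utilities.values())  # shift payoffs for numerical stability
    weights = [math.exp(beta * (utilities[a] - u_max)) for a in actions]
    return random.choices(actions, weights=weights)[0]

print(logit_choice({"a": 1.0, "b": 0.5}, beta=5.0))  # returns "a" about 92% of the time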

The focus of this paper is on the first question. In particular, we introduce a class of games termed semi-anonymous potential games. Such games can be parameterized by populations of agents, where each agent's local objective function can be evaluated using only information regarding the agent's own decision and the aggregate behavior within each of the populations. The classical congestion game, as introduced in [21], represents a semi-anonymous potential game where a population corresponds to a collection of agents sharing a common set of paths connecting their source and destination. First, we introduce a variant of log-linear learning for semi-anonymous potential games that closely resembles the algorithm introduced in [22]. Next, we prove that for semi-anonymous potential games, the convergence time associated with this algorithm grows roughly linearly in the number of agents, provided that the number of populations is bounded. This result can be viewed as an extension of [22], which focuses purely on the single-population setting.

II. PRELIMINARIES

Consider a game with agents $N = \{1, 2, \ldots, n\}$ where each agent $i \in N$ has a finite action set denoted by $A_i$ and a utility function $U_i : A \to \mathbb{R}$, where $A = \prod_{i \in N} A_i$ denotes the set of joint actions. We will frequently express an action profile $a \in A$ as $(a_i, a_{-i})$, where $a_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n)$ denotes the collection of actions of all agents other than agent $i$. Similarly, we let $A_{-i} = A_1 \times \cdots \times A_{i-1} \times A_{i+1} \times \cdots \times A_n$ denote the action sets of all players excluding player $i$. We will denote a game $G$ by the tuple $G = (N, \{A_i\}_{i \in N}, \{U_i\}_{i \in N})$; for brevity, we refer to $G$ by $G = (N, \{A_i\}, \{U_i\})$.

Definition 1: A game $G$ is a semi-anonymous potential game if there exists a partition $\bar{N} = (N_1, N_2, \ldots, N_m)$ of $N$ such that the following three conditions are satisfied:

(i) For any population $N_i \in \bar{N}$ and agents $i, j \in N_i$ we have $A_i = A_j$. Accordingly, we say population $N_i$ has action set $A_{N_i} = \{a_{N_i}^1, a_{N_i}^2, \ldots, a_{N_i}^{s_i}\}$, where $s_i$ denotes the number of actions available to population $N_i$.

(ii) The utility function of any agent $i \in N$ can be expressed as a lower-dimensional function of the form $U_i : A_i \times X \to \mathbb{R}$, where $X = X_1 \times \cdots \times X_m$ and
$$X_j = \left\{ \left( \frac{v_j^1}{n}, \frac{v_j^2}{n}, \ldots, \frac{v_j^{s_j}}{n} \right) : \sum_{k=1}^{s_j} v_j^k = |N_j| \right\}$$
for any $j \in \{1, \ldots, m\}$, with each $v_j^k$ a nonnegative integer.

(iii) There exists a potential function $\phi : X \to \mathbb{R}$ such that for any $a \in A$ and any agent $i \in N$ with action $a_i' \in A_i$,
$$U_i(a) - U_i(a_i', a_{-i}) = \phi(x(a)) - \phi(x(a_i', a_{-i})), \qquad (1)$$
where $x : A \to X$ captures each population's aggregate behavior in $a$. More specifically, $x(a) = (x_1(a), x_2(a), \ldots, x_m(a))$ where, for any $j \in \{1, \ldots, m\}$,
$$x_j(a) = \frac{1}{n} \Big( \big| \{ i \in N_j : a_i = a_{N_j}^k \} \big| \Big)_{k = 1, \ldots, s_j}. \qquad (2)$$

If each agent $i \in N$ is alone in its respective partition, the definition of semi-anonymous potential games is equivalent to that of exact potential games in [17].

Example 1: A congestion game is an example of a semi-anonymous potential game. Consider a congestion game with players $N = \{1, \ldots, n\}$ and roads $R = \{r_1, r_2, \ldots, r_k\}$. Road $r \in R$ has congestion function $C_r : \mathbb{Z}_+ \to \mathbb{R}$, where $C_r(k)$ is the congestion on road $r$ when $k$ users select it. Player $i$'s action set is $A_i \subseteq 2^R$, and he selects a set of roads in $A_i$ to minimize his cost
$$J_i(a_i, a_{-i}) = \sum_{r \in a_i} C_r(|\{a\}_r|),$$
where $\{a\}_r$ is the set of players choosing road $r \in R$ under joint action profile $a$. This game has potential function $\phi : A \to \mathbb{R}$,
$$\phi(a) = \sum_{r \in R} \sum_{k=1}^{|\{a\}_r|} C_r(k). \qquad (3)$$
When player action sets are asymmetric, a congestion game is a semi-anonymous potential game. In the case where player action sets are symmetric, the congestion game is an anonymous potential game.
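As a quick sanity check of Example 1, the following sketch (a toy instance of our own choosing, with singleton actions, not from the paper) verifies the potential property: a unilateral deviation changes a player's cost by exactly the change in Rosenthal's potential (3).

```python
import itertools

# Toy congestion game: 3 players, 3 roads, each action a single road.
C = {"r1": lambda k: k, "r2": lambda k: 2 * k, "r3": lambda k: k ** 2}
roads = list(C)

def cost(i, a):
    """Player i's congestion cost under joint profile a (singleton actions)."""
    load = sum(1 for aj in a if aj == a[i])
    return C[a[i]](load)

def phi(a):
    """Rosenthal potential: for each road, C_r(1) + ... + C_r(load_r)."""
    return sum(sum(C[r](k) for k in range(1, list(a).count(r) + 1)) for r in roads)

# Check the potential property for every profile and unilateral deviation.
for a in itertools.product(roads, repeat=3):
    for i in range(3):
        for dev in roads:
            b = list(a); b[i] = dev; b = tuple(b)
            assert cost(i, a) - cost(i, b) == phi(a) - phi(b)
print("potential property verified")
```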
III. MAIN RESULTS

Consider the following algorithm, termed modified log-linear learning, which is a mild variant of the algorithm introduced in [22]. Let $a(t)$ represent the action profile at time $t \geq 0$. For any population $N_j \in \bar{N}$, each agent $i \in N_j$ has a Poisson clock of rate $\alpha n / z_j^i(t)$, where $z_j^i(t) = |\{k \in N_j : a_k(t) = a_i(t)\}|$; hence, a player has a higher update rate if the player is not using a common action. If player $i$'s clock ticks, he chooses action $a_i' \in A_i = A_{N_j}$ probabilistically according to
$$\text{Prob}[a_i(t^+) = a_i'] = \frac{e^{\beta U_i(a_i',\, a_{-i}(t))}}{\sum_{a_i'' \in A_i} e^{\beta U_i(a_i'',\, a_{-i}(t))}} = \frac{e^{\beta \phi(x(a_i',\, a_{-i}(t)))}}{\sum_{a_i'' \in A_i} e^{\beta \phi(x(a_i'',\, a_{-i}(t)))}}, \qquad (4)$$
where $a_i(t^+)$ indicates the player's new action choice after the update, and the parameter $\beta$ dictates how likely a player is to choose a high payoff action: as $\beta \to \infty$, payoff maximizing actions are chosen, and as $\beta \to 0$, players choose from their action sets with uniform probability. Accordingly, the new action profile is of the form $a(t^+) = (a_i(t^+), a_{-i}(t))$.

Update rates are the only variation between this algorithm and the continuous time implementation of standard logit response. In the standard logit response dynamics, all agents have a fixed Poisson clock of rate 1, which yields $n$ expected updates per second.
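The revision process above can be simulated event by event. The following illustrative sketch is ours (names such as `phi_of_profile` are placeholders, not from the paper): it samples which agent's clock ticks next using the rates $\alpha n / z$, then resamples that agent's action according to (4).

```python
import math, random

def mll_step(actions, pops, pop_actions, phi_of_profile, alpha, beta):
    """One revision event of modified log-linear learning; returns elapsed time.

    actions: list, actions[i] is agent i's current action
    pops: list, pops[i] is agent i's population label
    pop_actions: dict mapping a population label to its common action set
    """
    n = len(actions)
    # Each agent's clock rate is alpha*n / z, where z counts same-population
    # agents currently playing the same action as the agent (z >= 1 always).
    rates = []
    for i in range(n):
        z = sum(1 for k in range(n) if pops[k] == pops[i] and actions[k] == actions[i])
        rates.append(alpha * n / z)
    dt = random.expovariate(sum(rates))             # time until the next tick
    i = random.choices(range(n), weights=rates)[0]  # whose clock ticks
    # Resample agent i's action with the probabilities from Eq. (4).
    cands = pop_actions[pops[i]]
    weights = []
    for b in cands:
        trial = actions[:i] + [b] + actions[i + 1:]
        weights.append(math.exp(beta * phi_of_profile(trial)))
    actions[i] = random.choices(cands, weights=weights)[0]
    return dt
```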

The expected number of updates per second for the modified log-linear learning algorithm is lower bounded by $m \alpha n$ and upper bounded by $\sigma \alpha n$; hence, the number of updates is on the order of the number of updates in the standard logit response dynamics for any given $\alpha$. To achieve an expected update rate which is at least as fast as the standard log-linear learning update rate, we may assume that $\alpha \geq 1/m$, so that the expected number of updates per second is at least $n$. Note that for any $\alpha$, the proposed dynamics give rise to an ergodic, reversible Markov process.

The following theorem extends the results of [22] to semi-anonymous potential games.

Theorem 1: Let $G = (N, \{A_i\}, \{U_i\})$ be a semi-anonymous limited population potential game with populations $\{N_1, N_2, \ldots, N_m\}$. Define $n_i := |N_i|$, $\sigma := \sum_{i=1}^m |A_{N_i}|$, and $s := \max_{i \in \{1, \ldots, m\}} |A_{N_i}|$. Let the potential function $\phi$ be such that $\phi(x) \in [0, 1]$ for all $x \in X$. Further, let $\phi$ be $\lambda$-Lipschitz, i.e.,
$$|\phi(x) - \phi(y)| \leq \lambda \|x - y\|, \quad \forall x, y \in X, \qquad (5)$$
and let $m + \sum_{i=1}^m n_i \geq \sigma$. For any $\epsilon \in (0, 1)$ and any initial state, under modified log-linear learning, if
$$\beta \geq \max\left\{ \frac{4m(s-1)}{\epsilon} \log(ms),\ \frac{4m(s-1)}{\epsilon} \log\frac{8ms\lambda}{\epsilon} \right\} \qquad (6)$$
then
$$E[\phi(x(a(t)))] \geq \sup_{x \in X} \phi(x) - \epsilon \qquad (7)$$
for all
$$t \geq \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log (n+1)^{ms-m} + \log\beta + \log\frac{1}{\epsilon} \right), \qquad (8)$$
where $c_2$ is a constant that depends only on $s$. The proof of Theorem 1 is provided in Appendix D.

The results in [22] focus on a special class of semi-anonymous potential games with a single population, i.e., $m = 1$. Consequently, the bounds given in [22] are better than the bounds provided in Theorem 1, owing to the fact that the aggregate state space $X$ grows significantly with the number of populations. The impact of this larger state space is realized in the dependence of the mixing time bound on the number of populations $m$, both directly and indirectly through the requirement on the parameter $\beta$ in Equation (6). However, for a fixed number of populations, the mixing time grows as $n \log\log n$.

IV. ILLUSTRATIVE EXAMPLE

We present an example which supports our theorem that, as $n$ increases, the time to convergence in modified log-linear learning grows as $\Theta(n \log\log n)$. Consider a resource allocation game with resources $R = \{r_1, r_2, r_3\}$ and agents $N = \{1, 2, \ldots, n\}$ distributed into two populations, $N_1$ and $N_2$. Let $n$ be even and suppose each population contains $n/2$ agents. A player in population $N_1$ may choose one resource from $\{r_1, r_2\}$, and a player in population $N_2$ may choose one resource from $\{r_2, r_3\}$. A state $x$ can be represented as a tuple of the form $x = (x_1, x_2, x_3, x_4)$, where $n x_1$ and $n x_2$ are the numbers of players from $N_1$ choosing resources $r_1$ and $r_2$ respectively. Likewise, $n x_3$ and $n x_4$ are the numbers of players from $N_2$ choosing resources $r_2$ and $r_3$ respectively. For $x = (x_1, x_2, x_3, x_4)$, the welfare for each resource in $R$ is given by
$$W_1(x) = \frac{e^{x_1}}{2e}, \qquad W_2(x) = \frac{e^{2x_2 + 2x_3}}{2e^2}, \qquad W_3(x) = \frac{e^{1.5 x_4}}{2e},$$
and the system welfare for state $x$ is given by
$$W(x) = W_1(x) + W_2(x) + W_3(x). \qquad (9)$$
By defining each agent's utility as its marginal contribution to the system welfare, i.e.,
$$U_i(a_i, a_{-i}) = W\big(x(a_i, a_{-i})\big) - W\big(x(\emptyset, a_{-i})\big),$$
where $\emptyset$ denotes the null action in which player $i$ selects no resource, the resulting game will be a semi-anonymous potential game with potential function $W$. It is straightforward to verify that the allocation that optimizes the system welfare is $a_i = r_2$ for all $i \in N$, i.e., $x^{\text{opt}} = (0, 1/2, 1/2, 0)$.

We simulated our proposed algorithm with rate $\alpha = 1/\sigma$, starting from an inefficient Nash equilibrium $x^{\text{ne}} = (1/2, 0, 0, 1/2)$.
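The optimality and equilibrium claims above are easy to test numerically. The sketch below is ours and assumes the welfare functions as written above (reconstructed from the garbled source); it enumerates all states for a small $n$, confirms $x^{\text{opt}}$ maximizes $W$, and checks that no single agent can improve its marginal-contribution utility at $x^{\text{ne}}$.

```python
import math

# Welfare as reconstructed above; states are (x1, x2, x3, x4) with
# x1 + x2 = x3 + x4 = 1/2 (two populations of n/2 agents each).
n = 8
W = lambda x1, x2, x3, x4: (math.exp(x1) / (2 * math.e)
                            + math.exp(2 * x2 + 2 * x3) / (2 * math.e ** 2)
                            + math.exp(1.5 * x4) / (2 * math.e))

states = [(k1 / n, (n // 2 - k1) / n, k2 / n, (n // 2 - k2) / n)
          for k1 in range(n // 2 + 1) for k2 in range(n // 2 + 1)]
print(max(states, key=lambda x: W(*x)))   # -> (0.0, 0.5, 0.5, 0.0) = x_opt

# Check x_ne = (1/2, 0, 0, 1/2) is a Nash equilibrium under marginal-
# contribution utilities: no single agent gains by switching resources.
x = (0.5, 0.0, 0.0, 0.5)
stay_N1 = W(*x) - W(x[0] - 1 / n, x[1], x[2], x[3])            # agent on r1
move_N1 = W(x[0] - 1 / n, x[1] + 1 / n, x[2], x[3]) - W(x[0] - 1 / n, x[1], x[2], x[3])
stay_N2 = W(*x) - W(x[0], x[1], x[2], x[3] - 1 / n)            # agent on r3
move_N2 = W(x[0], x[1], x[2] + 1 / n, x[3] - 1 / n) - W(x[0], x[1], x[2], x[3] - 1 / n)
assert stay_N1 >= move_N1 and stay_N2 >= move_N2
```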
Using Theorem 1, we examine the time it takes for the total variation distance between the empirical frequency and the stationary distribution to drop below $\epsilon/2 = 0.05$, i.e., the smallest $t$ such that
$$\|\nu(t) - \pi\|_{TV} := \frac{1}{2} \sum_{x \in X} |\nu_x(t) - \pi_x| \leq 0.05, \qquad (10)$$
where the empirical frequency for a given state $x$ is defined by
$$\nu_x(t) = \frac{1}{t} \int_0^t \mathbb{I}\{x(\tau) = x\}\, d\tau.$$
The indicator function $\mathbb{I}$ returns 1 if the state $x(\tau)$ is $x$ and 0 otherwise, so that the component $\nu_x(t)$ is the fraction of time from $\tau = 0$ to $\tau = t$ that the system has been in state $x$. Setting $\beta = 50$ ensures that $E_\pi[W(x)] \geq \max_x W(x) - \epsilon/2$, so that, when $\|\nu(t) - \pi\|_{TV} \leq \epsilon/2$, we have $E_{\nu(t)}[W(x)] \geq \max_x W(x) - \epsilon$ per Theorem 1. Simulation results are shown in Figure 1 for an average over 100 simulations with $n$ ranging from 4 to 100. Average convergence times are bounded below by $7 n \log\log n$ and above by $100\, n \log\log n$, with the constant in the upper bound decreasing for larger $n$. These results support our main theorem.
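The convergence metric (10) can be sketched in code as follows. This is our own illustration (discrete revision events standing in for the continuous-time clock; `step` and `state_index` are hypothetical helpers, not from the paper): it accumulates the empirical frequency and reports when its total variation distance to the Gibbs distribution $\pi_x \propto e^{\beta \phi(x)}$ falls below the target.

```python
import numpy as np

def tv_distance(nu, pi):
    """Total variation distance between two distributions, per Eq. (10)."""
    return 0.5 * np.abs(nu - pi).sum()

def time_to_mix(step, state_index, pi, x0, target=0.05, max_steps=10**6):
    """Run the chain from x0 until the empirical frequency is TV-close to pi."""
    counts = np.zeros(len(pi))
    state = x0
    for t in range(1, max_steps + 1):
        state = step(state)                 # one modified log-linear update
        counts[state_index(state)] += 1
        if t % 1000 == 0 and tv_distance(counts / t, pi) <= target:
            return t
    return None
```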

Fig. 1: Average convergence times with respect to the number of players. The figure plots the average convergence time over 100 simulations (time to reach within 0.05 of the stationary distribution) against the number of players $n$, together with the curves $7 n \log\log n$ and $100\, n \log\log n$.

V. CONCLUSION

We have extended the results of [22] to define dynamics for a class of semi-anonymous limited population potential games, whose player utility functions may be written as functions of aggregate behavior within each population. For games with a small number of fixed allowable actions and a small number of fixed populations, the time it takes to come arbitrarily close to a potential function maximizer is roughly linear in the number of players.

REFERENCES

[1] C. Alós-Ferrer and N. Netzer. The logit-response dynamics. Games and Economic Behavior, 68(2):413-427, 2010.
[2] G. Arslan, J. R. Marden, and J. S. Shamma. Autonomous vehicle-target assignment: a game theoretical formulation. ASME Journal of Dynamic Systems, Measurement and Control, 129(5):584-596, 2007.
[3] L. E. Blume. The statistical mechanics of strategic interaction. Games and Economic Behavior, 1993.
[4] G. Ellison. Basins of attraction, long-run stochastic stability, and the speed of step-by-step evolution. The Review of Economic Studies, 2000.
[5] D. P. Foster and H. P. Young. On the nonconvergence of fictitious play in coordination games. Games and Economic Behavior, 25(1):79-96, October 1998.
[6] M. J. Fox and J. S. Shamma. Communication, convergence, and stochastic stability in self-assembly. In 49th IEEE Conference on Decision and Control (CDC), December 2010.
[7] A. Frieze and R. Kannan. Log-Sobolev inequalities and sampling from log-concave distributions. The Annals of Applied Probability, 1998.
[8] T. Goto, T. Hatanaka, and M. Fujita. Potential game theoretic attitude coordination on the circle: Synchronization and balanced circular formation. In 2010 IEEE International Symposium on Intelligent Control, pages 1314-1319, September 2010.
[9] G. Grimmett and D. Stirzaker. Probability and Random Processes. Oxford University Press, New York, 3rd edition, 2001.
[10] S. Hart and Y. Mansour. How long to equilibrium? The communication complexity of uncoupled equilibrium procedures. Games and Economic Behavior, 69:107-126, May 2010.
[11] S. Hart and A. Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68(5):1127-1150, 2000.
[12] M. Kandori, G. J. Mailath, and R. Rob. Learning, mutation, and long run equilibria in games. Econometrica, 61(1):29-56, 1993.
[13] G. E. Kreindler and H. P. Young. Fast convergence in evolutionary equilibrium selection. 2011.
[14] J. R. Marden, G. Arslan, and J. S. Shamma. Joint strategy fictitious play with inertia for potential games. IEEE Transactions on Automatic Control, 54(2):208-220, 2009.
[15] J. R. Marden and A. Wierman. Distributed welfare games. Operations Research, 2008.
[16] D. Monderer and L. S. Shapley. Fictitious play property for games with identical interests. Journal of Economic Theory, 68:258-265, 1996.
[17] D. Monderer and L. S. Shapley. Potential games. Games and Economic Behavior, 14:124-143, 1996.
[18] A. Montanari and A. Saberi. The spread of innovations in social networks. Proceedings of the National Academy of Sciences, 107(47):20196-20201, 2010.
[19] R. Montenegro and P. Tetali. Mathematical Aspects of Mixing Times in Markov Chains, volume 1. Now Publishers, 2006.
[20] J. R. Norris. Markov Chains. Cambridge University Press, Cambridge, 1997.
[21] R. W. Rosenthal. A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory, 2:65-67, 1973.
[22] D. Shah and J. Shin. Dynamics in congestion games. In ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2010.
[23] M. Staudigl. Stochastic stability in asymmetric binary choice coordination games. Games and Economic Behavior, 75(1):372-401, May 2012.
[24] D. H. Wolpert and K. Tumer. Optimal payoff functions for members of collectives. Advances in Complex Systems, 4:265-279, 2001.
[25] M. Zhu and S. Martinez. Distributed coverage games for mobile visual sensors (I): Reaching the set of Nash equilibria. In Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, 2009.

APPENDIX A
Markov chain preliminaries

The following Markov chain preliminaries are taken from [20], [22]. Let $\{X_k : k \in \mathbb{N}\}$ be a discrete time Markov chain over finite state space $\Omega$ with probability transition matrix $M$, so that the distribution of $X_k$ at time $k \in \mathbb{N}$ satisfies
$$\mu(k) = \mu(0) M^k. \qquad (11)$$
A measure $\pi$ on $\Omega$ is called a stationary distribution of $M$ if $\pi_x \geq 0$ for all $x \in \Omega$, $\sum_{x \in \Omega} \pi_x = 1$, and
$$\pi = \pi M. \qquad (12)$$
A Markov chain with transition matrix $M$ is irreducible if, for all $x, y \in \Omega$, there exists $k \geq 0$ such that $M^k(x, y) > 0$. The chain is aperiodic if, for all $x \in \Omega$, $\gcd\{k : M^k(x, x) > 0\} = 1$. An irreducible and aperiodic Markov chain has a unique stationary distribution $\pi$, such that $\lim_{k \to \infty} \mu(0) M^k = \pi$ for any initial distribution $\mu(0)$. Finally, $M$ is reversible with respect to distribution $\pi$ if $\pi_x M(x, y) = \pi_y M(y, x)$ for all $x, y \in \Omega$. If $M$ is reversible with respect to $\pi$, then $\pi$ is a stationary distribution for $M$ [9].

A continuous time Markov chain $\{X_t\}_{t \geq 0}$ over finite state space $\Omega$ may be written in terms of a corresponding discrete time chain with transition matrix $M$ [9]:
$$\mu(t) = \mu(0)\, e^{t(M - I)} = \mu(0)\, e^{-t} \sum_{k \geq 0} \frac{t^k M^k}{k!}, \quad t \geq 0. \qquad (13)$$
We refer to the matrix $M$ as the kernel of the process $X_t$.
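To see Equation (13) in action numerically, here is a small illustrative sketch (a toy 3-state kernel of our own choosing, not from the paper), evolving $\mu(t) = \mu(0) e^{t(M - I)}$ with a matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

M = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])   # discrete-time kernel of the process
mu0 = np.array([1.0, 0.0, 0.0])      # initial distribution

for t in (0.0, 1.0, 10.0, 100.0):
    mut = mu0 @ expm(t * (M - np.eye(3)))
    print(t, np.round(mut, 4))       # converges to the stationary (0.25, 0.5, 0.25)
```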

The following definitions and theorems regarding distances between measures and mixing times are taken from [19], [22]. Let $\mu, \nu$ be measures on finite state space $\Omega$. Total variation distance is defined as
$$\|\mu - \nu\|_{TV} := \frac{1}{2} \sum_{x \in \Omega} |\mu_x - \nu_x|, \qquad (14)$$
and the relative entropy (Kullback-Leibler divergence) is defined as
$$D(\mu : \nu) := \sum_{x \in \Omega} \mu_x \log \frac{\mu_x}{\nu_x}. \qquad (15)$$
Then
$$D(\mu : \nu) \geq 2\, \|\mu - \nu\|_{TV}^2. \qquad (16)$$
For a continuous time Markov chain with kernel $M$ and stationary distribution $\pi$, the distribution $\mu(t)$ obeys
$$D(\mu(t) : \pi) \leq e^{-4 t \rho(M)}\, D(\mu(0) : \pi), \quad t \geq 0, \qquad (17)$$
where $\rho(M)$ is the Sobolev constant of $M$, defined by
$$\rho(M) := \inf_{f : \Omega \to \mathbb{R},\ L(f) \neq 0} \frac{E(f, f)}{L(f)} \qquad (18)$$
with
$$E(f, f) := \frac{1}{2} \sum_{x, y \in \Omega} (f(x) - f(y))^2\, M(x, y)\, \pi_x \qquad (19)$$
$$L(f) := E_\pi\!\left[ f^2 \log \frac{f^2}{E_\pi[f^2]} \right]. \qquad (20)$$
Here $E_\pi$ denotes the expectation with respect to distribution $\pi$. Then, for a Markov chain with initial distribution $\mu(0)$ and stationary distribution $\pi$, the total variation and relative entropy mixing times are
$$T_{TV}(\epsilon) := \min\{ t : \|\mu(t) - \pi\|_{TV} \leq \epsilon \} \qquad (21)$$
$$T_D(\epsilon) := \min\{ t : D(\mu(t) : \pi) \leq \epsilon \} \qquad (22)$$
respectively. From [19], Corollary 2.6 and the remark following it,
$$T_D(\epsilon) \leq \frac{1}{4 \rho(M)} \left( \log\log \frac{1}{\pi_{\min}} + \log \frac{1}{\epsilon} \right), \qquad (23)$$
where $\pi_{\min} := \min_{x \in \Omega} \pi_x$. Applying (16),
$$T_{TV}(\epsilon) \leq T_D(2\epsilon^2) \leq \frac{1}{4 \rho(M)} \left( \log\log \frac{1}{\pi_{\min}} + \log \frac{1}{2\epsilon^2} \right). \qquad (24)$$
Hence, a lower bound on the Sobolev constant $\rho(M)$ provides an upper bound on the mixing time for the Markov chain.

APPENDIX B
Markov chain defined by modified log-linear learning

The following Markov chain $M$ over state space $X$ is the kernel of the continuous time modified log-linear learning process. Let $e_i^j \in \mathbb{R}^{s_i}$ be the $j$th standard basis vector of length $s_i$, for $j \in \{1, \ldots, s_i\}$. From state $x = (x_{N_i}, x_{-N_i})$, transition to the next state as follows:

- Choose a population $N_i \in \{N_1, N_2, \ldots, N_m\}$ with probability $s_i / \sigma$.
- Choose an action $a_{N_i}^j \in A_{N_i}$ with probability $1 / s_i$.
- If $x_{N_i}^j > 0$, choose a player $p \in \{p \in N_i : a_p = a_{N_i}^j\}$ uniformly at random to update according to (4), i.e., transition to $(x_{N_i} + \frac{1}{n}(e_i^k - e_i^j),\, x_{-N_i})$ with probability
$$\frac{e^{\beta \phi(x_{N_i} + \frac{1}{n}(e_i^k - e_i^j),\ x_{-N_i})}}{\sum_{l=1}^{s_i} e^{\beta \phi(x_{N_i} + \frac{1}{n}(e_i^l - e_i^j),\ x_{-N_i})}}$$
for each $k \in \{1, 2, \ldots, s_i\}$.

Using this to define transition probabilities in $M$: for any allowable transition from state $x = (x_{N_i}, x_{-N_i})$ to state $y = (x_{N_i} + \frac{1}{n}(e_i^k - e_i^j),\, x_{-N_i})$ in which a player from population $N_i$ updates his action,
$$M(x, y) = \frac{s_i}{\sigma} \cdot \frac{1}{s_i} \cdot \frac{e^{\beta \phi(y)}}{\sum_{l=1}^{s_i} e^{\beta \phi(x_{N_i} + \frac{1}{n}(e_i^l - e_i^j),\ x_{-N_i})}} = \frac{1}{\sigma} \cdot \frac{e^{\beta \phi(y)}}{\sum_{l=1}^{s_i} e^{\beta \phi(x_{N_i} + \frac{1}{n}(e_i^l - e_i^j),\ x_{-N_i})}}. \qquad (25)$$
For a transition of any other form, $M(x, y) = 0$. Applying (13) to the chain with kernel $M$ and global clock rate $\alpha \sigma n$, modified log-linear learning evolves as
$$\mu(t) = \mu(0)\, e^{\alpha \sigma n t (M - I)}. \qquad (26)$$

APPENDIX C
Theorem 1 supporting lemmas

To prove Theorem 1 we require three supporting lemmas. With Lemma 1 we accomplish two tasks. First, we establish the stationary distribution for modified log-linear learning as a function of $\beta$. Second, we characterize how large $\beta$ must be to ensure that the expected value of the potential function is within $\epsilon/2$ of the potential function maximum. Lemma 2 establishes a lower bound on the Sobolev constant for the Markov chain $M$; a lower bound for the Sobolev constant associated with a given Markov chain provides an upper bound on its mixing time, as in Equation (24). Finally, Lemma 3 bounds the time for the continuous time modified log-linear learning process to come within $\epsilon/2$ of its stationary distribution. The lemmas and method of proof for Theorem 1 follow the structure of the supporting lemmas and proof for Theorem 3 in [22].

Lemma 1: For the potential game $G = (N, \{A_i\}, \{U_i\})$ described in Section II with state space $X$ and potential function $\phi : X \to \mathbb{R}$, the stationary distribution under modified log-linear learning is
$$\pi_x = \frac{e^{\beta \phi(x)}}{\sum_{z \in X} e^{\beta \phi(z)}}, \quad x \in X. \qquad (27)$$
Moreover, if $\phi$ is $\lambda$-Lipschitz, i.e., $|\phi(x) - \phi(y)| \leq \lambda \|x - y\|$ for all $x, y \in X$, and
$$\beta \geq \max\left\{ \frac{4m(s-1)}{\epsilon} \log(ms),\ \frac{4m(s-1)}{\epsilon} \log\frac{8ms\lambda}{\epsilon} \right\}, \qquad (28)$$
then
$$E_\pi[\phi(x)] \geq \sup_{x \in X} \phi(x) - \epsilon/2. \qquad (29)$$
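The reversibility claim in Lemma 1 is easy to test numerically. The sketch below (a toy single-population instance of our own choosing) builds the Appendix B kernel for $m = 1$ and confirms that $\pi_x \propto e^{\beta \phi(x)}$ satisfies $\pi = \pi M$.

```python
import itertools, math
import numpy as np

n, s, beta = 4, 3, 2.0
states = [x for x in itertools.product(range(n + 1), repeat=s) if sum(x) == n]
idx = {x: i for i, x in enumerate(states)}
phi = lambda x: (x[0] / n) ** 2          # an arbitrary potential on aggregates

M = np.zeros((len(states), len(states)))
for x in states:
    for j in range(s):                    # action j chosen w.p. 1/s (sigma = s here)
        if x[j] == 0:
            M[idx[x], idx[x]] += 1 / s    # no mover available: stay put
            continue
        nbrs = [tuple(v + (k == l) - (j == l) for l, v in enumerate(x)) for k in range(s)]
        w = np.array([math.exp(beta * phi(y)) for y in nbrs])
        for y, p in zip(nbrs, w / w.sum()):
            M[idx[x], idx[y]] += p / s    # move one agent from action j to k

pi = np.array([math.exp(beta * phi(x)) for x in states]); pi /= pi.sum()
print(np.allclose(pi @ M, pi))            # True: pi is stationary
```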

6 Moreover, if φ is λ-lipschitz, i.e., φ(x φ(y λ x y, x, y, and j 4m(s β max log ms, then 4m(s log 8msλ ff (8 E π [φ(x] sup φ(x /. (9 x The proof of this lemma follows closely with the structure of the proof in [] of Lemma 6. Proof: Using Equation (5 with π x e βφ(x z eβφ(z, we see that π x M(x, y π y M(y, x for all x, y. Hence M is reversible with distribution π, implying that π is its stationary distribution. For the second part of the proof, define the following: L β E π[φ(x], C β x e βφ(x, x arg max φ(x, B(x, δ {x : x x δ} x where δ [0, ] is constant. Because π is of exponential form with normalization factor C β, the derivative of log C β with respect to β is L β. Moreover, L β is monotonically increasing in β; thus we may use the Mean Value Theorem as follows: L β β (log C β log C 0 β log C β β log x eβφ(x φ(x + β log x eβ(φ(x φ(x (a φ(x + β log x B(x,δ e βδλ φ(x + β log B(x, δ e βδλ φ(x δλ + ( B(x β log, δ where (a is from the fact that φ is λ-lipschitz and the definition of B(x, δ. Using intermediate results in the proof of Lemma 6 of [], B(x, δ and are bounded as follows: m ( si B(x δ(ni +, δ, and (30 ms i i m (n i + si i To see the lower bound for B(x, δ, begin with state x (x N i, x N i and perform the following steps to obtain a new state, x (x Ni, x N i. Step : For each i {,,..., m}, j {,,... s i } either 0 x j N i + δ(k+ n ms i n i /n for all k {0,,..., n i }. or 0 x j N i δ(k+ n ms i n i /n for all k {0,,..., n i }. Step : Performing operations satisfying step for each N i and each j {,,..., s i } and then adding or subtracting an appropriate to each x si N i so that s i k xk N i n i /n results in a state x satisfying x x δ. ( si The term δ(ni+ ms i counts the number of ways to do steps and within a single population N i. The product in (30 counts the number of ways to perform any variation of the above steps over all populations {N, N,..., N m }. This counting argument provides a lower bound on the size of B(x, δ as it clearly does not account for all states within the set. Now, L β φ(x δλ + ( B(x β log, δ si ( m δ(ni+ φ(x δλ + β log i ms i m i (n i + si φ(x δλ + m β log ( δ ms i φ(x δλ + i m(s β log si ( δ ms Consider two cases: (i λ /4, and (ii λ > /4. For case (i, choose δ and let β 4m(s log ms. Then, ( L β φ(x m(s δ δλ + log β ms φ(x m(s /4 log ms β In [], the size of the set B(x, δ for a single population is bounded by (δ(n + /s s, which we extend to m populations using the product above. Similarly, the size of the single population state space is bounded by (n+ s, which we extend to the multiple population state space.

For case (i), choose $\delta = 1$ and let $\beta \geq \frac{4m(s-1)}{\epsilon} \log(ms)$. Then
$$L_\beta \geq \phi(x^*) - \delta\lambda - \frac{m(s-1)}{\beta} \log \frac{ms}{\delta} \geq \phi(x^*) - \frac{\epsilon}{4} - \frac{m(s-1)\, \epsilon \log(ms)}{4m(s-1) \log(ms)} = \phi(x^*) - \frac{\epsilon}{2}.$$
For case (ii), note that $\lambda > \epsilon/4$ implies $\epsilon/(4\lambda) < 1$, so we may choose $\delta = \epsilon/(4\lambda)$. Let $\beta \geq \frac{4m(s-1)}{\epsilon} \log \frac{8\lambda m s}{\epsilon}$. Then
$$L_\beta \geq \phi(x^*) - \delta\lambda - \frac{m(s-1)}{\beta} \log \frac{ms}{\delta} = \phi(x^*) - \frac{\epsilon}{4} - \frac{m(s-1)}{\beta} \log \frac{4\lambda ms}{\epsilon} \geq \phi(x^*) - \frac{\epsilon}{4} - \frac{\epsilon}{4} = \phi(x^*) - \frac{\epsilon}{2},$$
so that $E_\pi[\phi(x)] \geq \sup_{x \in X} \phi(x) - \epsilon/2$ as desired.

Lemma 2: For the Markov chain $M$ defined in Appendix B, if $\phi : X \to [0, 1]$ and $m + \sum_{i=1}^m n_i \geq \sigma$, then
$$\rho(M) \geq \frac{e^{-3\beta}}{c_1\, m (m(s-1))^2\, n^2}$$
for some constant $c_1$ which depends only on $s$.

Proof: Using the technique of [22], we compare the Sobolev constants for the chain $M$ and a similar random walk on a convex set. We follow these steps:

Step 1: Let $M^1$ be the Markov chain $M$ as defined in Appendix B with $\beta = 0$. Let $\pi^1$ be the stationary distribution of $M^1$. We use comparison techniques to lower bound the Sobolev constant $\rho(M)$ in terms of $\rho(M^1)$, obtaining the bound $\rho(M) \geq e^{-3\beta} \rho(M^1)$.

Step 2: Define $M^2$ by restricting transitions within $M^1$ to movement to and from certain holding actions. We use comparison techniques to lower bound $\rho(M^1)$ in terms of $\rho(M^2)$, obtaining the bound $\rho(M^1) \geq \rho(M^2)/s^2$.

Step 3: We consider a random walk over a convex set $C$ which is equivalent to $M^2$. For the random walk, we lower bound $E(f, f)$ and upper bound $L(f)$ for any function $f : C \to \mathbb{R}$. This yields a lower bound on its Sobolev constant per (18), and hence on $\rho(M^2)$.

Step 4: Finally, we combine the string of inequalities to obtain a bound on the Sobolev constant $\rho(M)$:
$$\rho(M) \geq e^{-3\beta} \rho(M^1) \geq \frac{e^{-3\beta}}{s^2} \rho(M^2) \geq \frac{e^{-3\beta}}{A s^2\, 2^{\sigma - m} (\sigma - m) \sum_{i=1}^m n_i^2} \geq \frac{e^{-3\beta}}{c_1\, m (m(s-1))^2\, n^2},$$
where $A$ is a constant, and $c_1$ is a constant depending only on $s$.

Step 1 ($M$ to $M^1$): Note that in $M^1$ an updating agent chooses his next action uniformly at random. Per Equation (27) with $\beta = 0$, the stationary distribution $\pi^1$ of $M^1$ is the uniform distribution over $X$. Let $x, y \in X$. We bound $\pi_x / \pi^1_x$ and $M(x, y)/M^1(x, y)$ in order to use Corollary 3.5 in [19]: since $\phi(x) \in [0, 1]$ for all $x \in X$,
$$e^{-\beta} \leq \frac{\pi_x}{\pi^1_x} = \frac{e^{\beta\phi(x)}\, |X|}{\sum_{y} e^{\beta\phi(y)}} \leq e^{\beta}. \qquad (31)$$
Similarly, for $y = (x_{N_i} + \frac{1}{n}(e_i^k - e_i^j),\, x_{-N_i})$,
$$e^{-\beta} \leq \frac{M(x, y)}{M^1(x, y)} = \frac{s_i\, e^{\beta\phi(y)}}{\sum_{l=1}^{s_i} e^{\beta\phi(x_{N_i} + \frac{1}{n}(e_i^l - e_i^j),\ x_{-N_i})}} \leq e^{\beta}. \qquad (32)$$
For a transition to any $y$ not of the form above, $M(x, y) = M^1(x, y) = 0$. Using this fact and Equations (31) and (32), we apply Corollary 3.5 in [19] to lower bound $\rho(M)$:
$$\rho(M) \geq e^{-3\beta} \rho(M^1). \qquad (33)$$

Step 2 ($M^1$ to $M^2$): Next, consider the Markov chain $M^2$ on $X$, where transitions from state $x$ to $y$ occur as follows:

- Choose a population $N_i$ with probability $s_i / \sigma$.
- Choose $j \in \{1, \ldots, s_i\}$ and $\kappa \in \{-1, 1\}$, each uniformly at random.
- If $\kappa = 1$ and $x^j_{N_i} > 0$, then $y = (x_{N_i} + \frac{1}{n}(e_i^{s_i} - e_i^j),\, x_{-N_i})$.
- If $\kappa = -1$ and $x^{s_i}_{N_i} > 0$, then $y = (x_{N_i} + \frac{1}{n}(e_i^j - e_i^{s_i}),\, x_{-N_i})$.

That is, for each population $N_i$, action $a^{s_i}_{N_i}$ is treated as a holding bin; allowable transitions in $M^2$ involve a single agent moving into or out of his population's holding bin. Since $M^2(x, y) = M^2(y, x)$ for any $x, y \in X$, $M^2$ is reversible with respect to the uniform distribution over $X$. Hence the stationary distribution is uniform, and $\pi^2 = \pi^1$.

For a transition $x$ to $y$ in which an agent from population $N_i$ changes his action, $M^1(x, y) \geq \frac{1}{s_i} M^2(x, y)$, implying
$$M^1(x, y) \geq \frac{1}{s} M^2(x, y), \quad \forall x, y \in X, \qquad (34)$$
since $s \geq s_i$ for all $i \in \{1, \ldots, m\}$. Using (34) and the fact that $\pi^1 = \pi^2$, we use Corollary 3.5 from [19] to bound $\rho(M^1)$:
$$\rho(M^1) \geq \frac{1}{s^2} \rho(M^2). \qquad (35)$$

Step 3 ($M^2$ to a random walk): The following random walk on
$$C = \left\{ (z_1, \ldots, z_{\sigma - m}) \in \mathbb{Z}_+^{\sigma - m} : \sum_{j=1}^{s_i - 1} z_i^j \leq n_i,\ i = 1, \ldots, m \right\}$$
is equivalent to $M^2$ (the holding bin coordinate of each population is dropped, and the remaining $\sigma - m$ coordinates count the agents playing each non-holding action). For $x \in C$, transition to $y$ as follows: choose $i \in [\sigma - m]$ and $\kappa \in \{-1, 1\}$, each uniformly at random, and set
$$y = \begin{cases} x + \kappa e_i & \text{if } x + \kappa e_i \in C \\ x & \text{otherwise.} \end{cases}$$
The stationary distribution of this random walk is uniform, i.e., $\pi_x = 1/|C|$ for all $x \in C$. Our aim is to lower bound the Sobolev constant of this walk, denoted $\rho(M^3)$, which, using the above steps, provides a lower bound on $\rho(M)$ and hence an upper bound on the mixing time of our modified log-linear learning algorithm. We first recall the definition of the Sobolev constant for the random walk $M^3$:
$$\rho(M^3) := \inf_{f : C \to \mathbb{R},\ L(f) \neq 0} \frac{E(f, f)}{L(f)}. \qquad (36)$$
Let $g : C \to \mathbb{R}$ be an arbitrary function. To lower bound $\rho(M^3)$, we will first lower bound $E(g, g)$ and then upper bound $L(g)$. The ratio of these bounds then serves as a lower bound for the ratio $E(g, g)/L(g)$. Since this bound is established for an arbitrary function from $C$ to $\mathbb{R}$, it lower bounds the infimum over all such functions, i.e., the Sobolev constant.

In the forthcoming analysis, we take advantage of a theorem due to [7], which applies to an extension of a given function $g : C \to \mathbb{R}$ to a function defined over the convex hull of $C$; here we formally define this extension. Let $K$ be the convex hull of $C$. Given $g : C \to \mathbb{R}$, we follow the procedure of [7], [22] to extend $g$ to a function $g_\epsilon : K \to \mathbb{R}$. For $x \in C$, let $C(x)$ and $C(x, \epsilon)$ be the $(\sigma - m)$-dimensional cubes of center $x$ and sides $1$ and $1 - \epsilon$ respectively. For sufficiently small $\epsilon > 0$ and $z \in C(x)$, define $g_\epsilon : K \to \mathbb{R}$ by
$$g_\epsilon(z) = \begin{cases} g(x) & \text{if } z \in C(x, \epsilon) \\ \frac{1}{2}(1 + \eta(z))\, g(x) + \frac{1}{2}(1 - \eta(z))\, g(y) & \text{otherwise,} \end{cases} \qquad (37)$$
where $y \in C$ is a point such that $D = C(x) \cap C(y)$ is the closest face of $C(x)$ to $z$ (if more than one $y$ satisfies this condition, one such point may be chosen arbitrarily), and $\eta(z) = \frac{2}{\epsilon} \mathrm{dist}(z, D) \in [0, 1)$. The dist function represents standard Euclidean distance in $\mathbb{R}^{\sigma - m}$. Define
$$I := \int_K \|\nabla g_\epsilon(z)\|^2\, dz \qquad (38)$$
$$J := \int_K g_\epsilon^2(z) \log \frac{g_\epsilon^2(z)\, \mathrm{vol}(K)}{\int_K g_\epsilon^2(y)\, dy}\, dz. \qquad (39)$$
Applying Theorem 1 of [7] to $K \subset \mathbb{R}^{\sigma - m}$, whose diameter is at most $\big( \sum_{i=1}^m n_i^2 \big)^{1/2}$, if $m + \sum_{i=1}^m n_i \geq \sigma$,
$$I \geq \frac{J}{A \sum_{i=1}^m n_i^2}. \qquad (40)$$
We lower bound $E(g, g)$ in terms of $I$ and then upper bound $L(g)$ in terms of $J$ to obtain a lower bound on their ratio with Equation (40). Since these bounds are derived for an arbitrary function $g : C \to \mathbb{R}$, the desired lower bound on the Sobolev constant follows.

Lower bounding $E(g, g)$ in terms of $I$: the extension $g_\epsilon$ is constant on each inner cube $C(x, \epsilon)$, so the gradient is supported on the boundary regions $C(x) \setminus C(x, \epsilon)$, where by construction $\|\nabla g_\epsilon(z)\| = |g(x) - g(y)|/\epsilon$ for the neighbor $y$ whose shared face is closest to $z$. Each such region has volume at most $\frac{\epsilon}{2}(1 + O(\epsilon))$, so
$$I = \sum_{x \in C} \int_{C(x)} \|\nabla g_\epsilon(z)\|^2\, dz \leq \frac{1 + O(\epsilon)}{2\epsilon} \sum_{\substack{x, y \in C \\ x \sim y}} (g(x) - g(y))^2,$$
where the sum is over ordered pairs of neighboring states.

Since $\pi_x M^3(x, y) = \frac{1}{2(\sigma - m)|C|}$ when $x \sim y$ and $0$ otherwise,
$$E(g, g) = \frac{1}{2} \sum_{x, y} (g(x) - g(y))^2\, M^3(x, y)\, \pi_x = \frac{1}{4(\sigma - m)|C|} \sum_{\substack{x, y \in C \\ x \sim y}} (g(x) - g(y))^2.$$
Combining the two displays, $I \leq \frac{2(\sigma - m)|C|}{\epsilon}(1 + O(\epsilon))\, E(g, g)$, and hence
$$E(g, g) \geq \frac{\epsilon\, I}{2(\sigma - m)|C|}\, (1 - O(\epsilon)). \qquad (41)$$

Before bounding $L(g)$ in terms of $J$, we claim:
$$\mathrm{vol}(K) \geq \frac{|C|}{2^{\sigma - m}}. \qquad (42)$$
To see this, note that for each $x \in C$, $\mathrm{vol}(K \cap C(x)) \geq 2^{-(\sigma - m)}$, as the smallest possible volume for this intersection occurs at a corner [22]. Moreover, for any $x, y \in C$, either $C(x) = C(y)$ or the two cubes intersect on a set of measure zero. Therefore, $\mathrm{vol}(K) \geq |C|\, 2^{-(\sigma - m)}$ as desired.

Now we upper bound $L(g)$ in terms of $J$, which was defined in Equation (39). Letting $\tilde{\nu}_x := \mathrm{vol}(C(x) \cap K)$, as $\epsilon \to 0$ the cubes $C(x)$ cover $K$ and $g_\epsilon(z) \to g(x)$ for $z \in C(x)$, so that
$$J \to \sum_{x \in C} \tilde{\nu}_x\, g^2(x) \log \frac{g^2(x)\, \mathrm{vol}(K)}{\sum_{y \in C} \tilde{\nu}_y\, g^2(y)}.$$
Using the variational identity $E_\pi[f \log(f / E_\pi f)] = \min_{c > 0} E_\pi[f \log(f/c) - f + c]$, whose minimand is pointwise nonnegative, and noting that $\sum_{x} \tilde{\nu}_x = \mathrm{vol}(K)$ while $\tilde{\nu}_x \geq 2^{-(\sigma - m)} = 2^{-(\sigma - m)}\, |C|\, \pi_x$,
$$J = \min_{c > 0} \sum_{x \in C} \tilde{\nu}_x \left[ g^2(x) \log \frac{g^2(x)}{c} - g^2(x) + c \right] \geq \frac{|C|}{2^{\sigma - m}} \min_{c > 0} \sum_{x \in C} \pi_x \left[ g^2(x) \log \frac{g^2(x)}{c} - g^2(x) + c \right] = \frac{|C|}{2^{\sigma - m}}\, L(g),$$
where the last equality holds because the minimum is attained at $c = E_\pi[g^2]$. Then
$$L(g) \leq \frac{2^{\sigma - m}}{|C|}\, J\, (1 + O(\epsilon)). \qquad (43)$$

Step 4 (Combining inequalities): Recall inequalities (40), (41), and (43):
$$I \geq \frac{J}{A \sum_{i=1}^m n_i^2}, \qquad E(g, g) \geq \frac{\epsilon\, I}{2(\sigma - m)|C|}(1 - O(\epsilon)), \qquad L(g) \leq \frac{2^{\sigma - m}}{|C|}\, J\, (1 + O(\epsilon)).$$
Using these, and the fact that $g : C \to \mathbb{R}$ was chosen arbitrarily,
$$\frac{E(f, f)}{L(f)} \geq \frac{\epsilon\, (1 - O(\epsilon))}{2 A\, 2^{\sigma - m} (\sigma - m) \sum_{i=1}^m n_i^2}, \quad \forall f : C \to \mathbb{R}. \qquad (44)$$
Therefore, fixing $\epsilon$ sufficiently small and absorbing it, together with the $O(\epsilon)$ corrections and the factor 2, into the constant $A$,
$$\rho(M^2) = \min_{f : C \to \mathbb{R}} \frac{E(f, f)}{L(f)} \geq \frac{1}{A\, 2^{\sigma - m} (\sigma - m) \sum_{i=1}^m n_i^2}. \qquad (45)$$
Combining equations (33), (35), and (45), which related the Sobolev constants of the Markov chains $M$, $M^1$, and $M^2$,
$$\rho(M) \geq e^{-3\beta} \rho(M^1) \geq \frac{e^{-3\beta}}{s^2} \rho(M^2) \geq \frac{e^{-3\beta}}{A s^2\, 2^{\sigma - m} (\sigma - m) \sum_{i=1}^m n_i^2} \stackrel{(a)}{\geq} \frac{e^{-3\beta}}{c_1\, m (m(s-1))^2\, n^2},$$
where $c_1$ is a constant depending only on $s$. Inequality (a) uses the facts that $\sigma \leq ms$, so that $\sigma - m \leq m(s-1)$, and $\sum_{i=1}^m n_i^2 \leq n^2$.

Lemma 3: For the Markov chain defined by modified log-linear learning with kernel $M$ and stationary distribution $\pi$, if $m + \sum_{i=1}^m n_i \geq \sum_{i=1}^m s_i$, then $\|\mu(t) - \pi\|_{TV} \leq \epsilon/2$ for all
$$t \geq \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log (n+1)^{ms-m} + \log\beta + \log\frac{1}{\epsilon} \right),$$
where $c_2 = c_2(s)$ is a constant depending only on $s$.

Proof: The proof of Lemma 3 follows the corresponding derivation in [22]. Applying Equation (24) to the continuous time Markov chain defined by modified log-linear learning, with kernel $M$ as defined in Appendix B and update rate at least $\alpha m n$,
$$T_{TV}(\epsilon/2) \leq \frac{1}{4 \alpha m n\, \rho(M)} \left( \log\log \frac{1}{\pi_{\min}} + \log \frac{2}{\epsilon^2} \right)$$
$$\leq \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log \frac{1}{\pi_{\min}} + \log \frac{1}{\epsilon} \right)$$
$$\stackrel{(a)}{\leq} \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log\left( e^{\beta} \prod_{i=1}^m (n_i + 1)^{s_i - 1} \right) + \log \frac{1}{\epsilon} \right)$$
$$\stackrel{(b)}{\leq} \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log (n+1)^{ms-m} + \log\beta + \log \frac{1}{\epsilon} \right)$$
for a constant $c_2$ depending only on $s$, which also absorbs the constant factors from Lemma 2 and from $\log\frac{2}{\epsilon^2} \leq 2\log\frac{1}{\epsilon} + 1$, where (a) and (b) are due to the following:
(a) We use the fact that $\pi_x = e^{\beta\phi(x)} / \sum_y e^{\beta\phi(y)} \geq e^{-\beta}/|X|$ since $\phi(x) \in [0, 1]$, so $1/\pi_{\min} \leq e^{\beta}|X|$;
(b) We use the fact that $|X| \leq \prod_{i=1}^m (n_i + 1)^{s_i - 1} \leq (n+1)^{ms-m}$, together with $\log(\beta + \log V) \leq \log\beta + \log\log V$ for $\beta$ and $\log V$ at least 2.

APPENDIX D
Proof of Theorem 1

We now use the above lemmas to prove Theorem 1, in a similar manner as the proof of Theorem 3 in [22].

Proof: From Lemma 1, if $\phi$ is $\lambda$-Lipschitz, i.e., $|\phi(x) - \phi(y)| \leq \lambda\|x - y\|$ for all $x, y \in X$, and
$$\beta \geq \max\left\{ \frac{4m(s-1)}{\epsilon} \log(ms),\ \frac{4m(s-1)}{\epsilon} \log\frac{8ms\lambda}{\epsilon} \right\},$$
then $E_\pi[\phi(x)] \geq \sup_{x \in X} \phi(x) - \epsilon/2$. From Lemma 3, if $m + \sum_{i=1}^m n_i \geq \sum_{i=1}^m s_i$ and
$$t \geq \frac{m s^2 c_2\, e^{3\beta}\, m(m(s-1))^2\, n}{4\alpha} \left( \log\log (n+1)^{ms-m} + \log\beta + \log\frac{1}{\epsilon} \right),$$
then $\|\mu(t) - \pi\|_{TV} \leq \epsilon/2$. Then
$$E[\phi(x(a(t)))] = E_{\mu(t)}[\phi(x)] \geq E_\pi[\phi(x)] - \|\mu(t) - \pi\|_{TV} \max_x \phi(x) \stackrel{(a)}{\geq} \sup_{x \in X} \phi(x) - \frac{\epsilon}{2} - \frac{\epsilon}{2} = \sup_{x \in X} \phi(x) - \epsilon,$$
where (a) is from the fact that $\phi(x) \in [0, 1]$.


More information

Correlated Equilibria: Rationality and Dynamics

Correlated Equilibria: Rationality and Dynamics Correlated Equilibria: Rationality and Dynamics Sergiu Hart June 2010 AUMANN 80 SERGIU HART c 2010 p. 1 CORRELATED EQUILIBRIA: RATIONALITY AND DYNAMICS Sergiu Hart Center for the Study of Rationality Dept

More information

Mean-Field optimization problems and non-anticipative optimal transport. Beatrice Acciaio

Mean-Field optimization problems and non-anticipative optimal transport. Beatrice Acciaio Mean-Field optimization problems and non-anticipative optimal transport Beatrice Acciaio London School of Economics based on ongoing projects with J. Backhoff, R. Carmona and P. Wang Thera Stochastics

More information

Bayesian Persuasion Online Appendix

Bayesian Persuasion Online Appendix Bayesian Persuasion Online Appendix Emir Kamenica and Matthew Gentzkow University of Chicago June 2010 1 Persuasion mechanisms In this paper we study a particular game where Sender chooses a signal π whose

More information

Population Games and Evolutionary Dynamics

Population Games and Evolutionary Dynamics Population Games and Evolutionary Dynamics (MIT Press, 200x; draft posted on my website) 1. Population games 2. Revision protocols and evolutionary dynamics 3. Potential games and their applications 4.

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Survival of Dominated Strategies under Evolutionary Dynamics. Josef Hofbauer University of Vienna. William H. Sandholm University of Wisconsin

Survival of Dominated Strategies under Evolutionary Dynamics. Josef Hofbauer University of Vienna. William H. Sandholm University of Wisconsin Survival of Dominated Strategies under Evolutionary Dynamics Josef Hofbauer University of Vienna William H. Sandholm University of Wisconsin a hypnodisk 1 Survival of Dominated Strategies under Evolutionary

More information

Long Run Equilibrium in Discounted Stochastic Fictitious Play

Long Run Equilibrium in Discounted Stochastic Fictitious Play Long Run Equilibrium in Discounted Stochastic Fictitious Play Noah Williams * Department of Economics, University of Wisconsin - Madison E-mail: nmwilliams@wisc.edu Revised March 18, 2014 In this paper

More information

Information Choice in Macroeconomics and Finance.

Information Choice in Macroeconomics and Finance. Information Choice in Macroeconomics and Finance. Laura Veldkamp New York University, Stern School of Business, CEPR and NBER Spring 2009 1 Veldkamp What information consumes is rather obvious: It consumes

More information

Pseudo-Potential Games

Pseudo-Potential Games Pseudo-Potential Games Burkhard C. Schipper Department of Economics University of Bonn preliminary: July 2004 Abstract The notion of pseudo-potential game is due to Dubey, Haimanko and Zapechelnyuk (2002).

More information

Convergence and No-Regret in Multiagent Learning

Convergence and No-Regret in Multiagent Learning Convergence and No-Regret in Multiagent Learning Michael Bowling Department of Computing Science University of Alberta Edmonton, Alberta Canada T6G 2E8 bowling@cs.ualberta.ca Abstract Learning in a multiagent

More information

Distributed Learning based on Entropy-Driven Game Dynamics

Distributed Learning based on Entropy-Driven Game Dynamics Distributed Learning based on Entropy-Driven Game Dynamics Bruno Gaujal joint work with Pierre Coucheney and Panayotis Mertikopoulos Inria Aug., 2014 Model Shared resource systems (network, processors)

More information

Irrational behavior in the Brown von Neumann Nash dynamics

Irrational behavior in the Brown von Neumann Nash dynamics Irrational behavior in the Brown von Neumann Nash dynamics Ulrich Berger a and Josef Hofbauer b a Vienna University of Economics and Business Administration, Department VW 5, Augasse 2-6, A-1090 Wien,

More information

Asymptotics of Efficiency Loss in Competitive Market Mechanisms

Asymptotics of Efficiency Loss in Competitive Market Mechanisms Page 1 of 40 Asymptotics of Efficiency Loss in Competitive Market Mechanisms Jia Yuan Yu and Shie Mannor Department of Electrical and Computer Engineering McGill University Montréal, Québec, Canada jia.yu@mail.mcgill.ca

More information

Deterministic Calibration and Nash Equilibrium

Deterministic Calibration and Nash Equilibrium University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 2-2008 Deterministic Calibration and Nash Equilibrium Sham M. Kakade Dean P. Foster University of Pennsylvania Follow

More information

Large Deviations and Stochastic Stability in the Small Noise Double Limit, I: Theory

Large Deviations and Stochastic Stability in the Small Noise Double Limit, I: Theory March 204 505 Center for Mathematical Economics Working Papers Large Deviations and Stochastic Stability in the Small oise Double Limit, I: Theory Willian H. Sandholm and Mathias Staudigl Center for Mathematical

More information

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R. Ergodic Theorems Samy Tindel Purdue University Probability Theory 2 - MA 539 Taken from Probability: Theory and examples by R. Durrett Samy T. Ergodic theorems Probability Theory 1 / 92 Outline 1 Definitions

More information

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, and Benjamin Van Roy Stanford University {gweintra,lanierb,bvr}@stanford.edu Abstract

More information

Volume 35, Issue 2. Cournot Equilibrium Uniqueness in Case of Concave Industry Revenue: a Simple Proof

Volume 35, Issue 2. Cournot Equilibrium Uniqueness in Case of Concave Industry Revenue: a Simple Proof Volume 35, Issue 2 Cournot Equilibrium Uniqueness in Case of Concave Industry Revenue: a Simple Proof Pierre von Mouche Wageningen Universiteit Federico Quartieri Universita degli studi di Napoli Federico

More information

Combinatorial Agency of Threshold Functions

Combinatorial Agency of Threshold Functions Combinatorial Agency of Threshold Functions Shaili Jain 1 and David C. Parkes 2 1 Yale University, New Haven, CT shaili.jain@yale.edu 2 Harvard University, Cambridge, MA parkes@eecs.harvard.edu Abstract.

More information

FORMULATION OF THE LEARNING PROBLEM

FORMULATION OF THE LEARNING PROBLEM FORMULTION OF THE LERNING PROBLEM MIM RGINSKY Now that we have seen an informal statement of the learning problem, as well as acquired some technical tools in the form of concentration inequalities, we

More information

Approximate Nash Equilibria with Near Optimal Social Welfare

Approximate Nash Equilibria with Near Optimal Social Welfare Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 015) Approximate Nash Equilibria with Near Optimal Social Welfare Artur Czumaj, Michail Fasoulakis, Marcin

More information

Generalized Mirror Descents with Non-Convex Potential Functions in Atomic Congestion Games

Generalized Mirror Descents with Non-Convex Potential Functions in Atomic Congestion Games Generalized Mirror Descents with Non-Convex Potential Functions in Atomic Congestion Games Po-An Chen Institute of Information Management National Chiao Tung University, Taiwan poanchen@nctu.edu.tw Abstract.

More information

Threshold Policy for Global Games with Noisy Information Sharing

Threshold Policy for Global Games with Noisy Information Sharing 05 IEEE 54th Annual Conference on Decision and Control CDC December 5-8, 05 Osaka, Japan Threshold olicy for Global Games with Noisy Information Sharing Hessam Mahdavifar, Ahmad Beirami, Behrouz Touri,

More information

Division of the Humanities and Social Sciences. Supergradients. KC Border Fall 2001 v ::15.45

Division of the Humanities and Social Sciences. Supergradients. KC Border Fall 2001 v ::15.45 Division of the Humanities and Social Sciences Supergradients KC Border Fall 2001 1 The supergradient of a concave function There is a useful way to characterize the concavity of differentiable functions.

More information

Lecture notes for Analysis of Algorithms : Markov decision processes

Lecture notes for Analysis of Algorithms : Markov decision processes Lecture notes for Analysis of Algorithms : Markov decision processes Lecturer: Thomas Dueholm Hansen June 6, 013 Abstract We give an introduction to infinite-horizon Markov decision processes (MDPs) with

More information

Mean Field Competitive Binary MDPs and Structured Solutions

Mean Field Competitive Binary MDPs and Structured Solutions Mean Field Competitive Binary MDPs and Structured Solutions Minyi Huang School of Mathematics and Statistics Carleton University Ottawa, Canada MFG217, UCLA, Aug 28 Sept 1, 217 1 / 32 Outline of talk The

More information

MDP Preliminaries. Nan Jiang. February 10, 2019

MDP Preliminaries. Nan Jiang. February 10, 2019 MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process

More information